Da Kuang

Postdoc
Department of Mathematics, UCLA
Email:   first name    last name    at    math.ucla.edu
CV [pdf] (Google Scholar)


Welcome!

I am a postdoc in applied mathematics at UCLA, working with Prof. Andrea Bertozzi. My research area is numerical methods for large-scale machine learning.

I received my PhD in Computational Science and Engineering from Georgia Tech, advised by Prof. Haesun Park. My thesis topic was nonnegative matrix factorization (NMF) for clustering.

I created an algorithm based on hierarchical rank-2 NMF for large-scale topic modeling that is about 20 times faster than latent Dirichlet allocation and 100 times faster than the original NMF, with comparable quality. The algorithm is now available as an open-source software package called smallk (Matlab code and an example are also available).

Previously, I obtained my bachelor's degree in computer science from Tsinghua University in Beijing, China. I started my college years in the Department of Mathematics, and later transferred to the Department of Computer Science and joined Yao Class. I worked with Prof. Min Zhang and Dr. Tao Qin on learning-to-rank algorithms for information retrieval.

Publications

Wei Zhu, Victoria Chayes, Alexandre Tiard, Stephanie Sanchez, Devin Dahlberg, Andrea Bertozzi, Stanley Osher, Dominique Zosso, and Da Kuang, Unsupervised classification in hyperspectral imagery with nonlocal total variation and primal-dual hybrid gradient algorithm, IEEE Transactions on Geoscience and Remote Sensing, 55(5):2786-2798, 2017. [arXiv] [link]

Da Kuang, Alex Gittens, and Raffay Hamid, Hardware compliant approximate image codes, Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR '15), pp. 924-932, Boston, MA, 2015. [pdf]

Da Kuang, Jaegul Choo, and Haesun Park, Nonnegative matrix factorization for interactive topic modeling and document clustering (book chapter), in Partitional Clustering Algorithms, Springer, 2015. [pdf] [link]

Nicolas Gillis, Da Kuang, and Haesun Park, Hierarchical clustering of hyperspectral images using rank-two nonnegative matrix factorization, IEEE Transactions on Geoscience and Remote Sensing, 53(4):2066-2078, 2015. [arXiv] [link]

Da Kuang, Sangwoon Yun, and Haesun Park, SymNMF: Nonnegative low-rank approximation of a similarity matrix for graph clustering, Journal of Global Optimization, 62(3):545-574, 2015. [pdf] [link]

Da Kuang and Haesun Park, Fast rank-2 nonnegative matrix factorization for hierarchical document clustering, Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '13), pp. 739-747, Chicago, IL, 2013. [pdf] [Matlab code] [C++ code on github]

Da Kuang, Chris Ding, and Haesun Park, Symmetric nonnegative matrix factorization for graph clustering, Proceedings of 2012 SIAM International Conference on Data Mining (SDM '12), pp. 106-117, Anaheim, CA, 2012. [pdf] [slides] [code]

Min Zhang, Da Kuang, Guichun Hua, Yiqun Liu, and Shaoping Ma, Is learning to rank effective for web search?, SIGIR 2009 Workshop on Learning to Rank for Information Retrieval, Boston, MA, 2009. [pdf]

Preprints

Da Kuang, Jeffrey Brantingham, and Andrea Bertozzi, Crime topic modeling. [arXiv]

Da Kuang, Zuoqiang Shi, Stanley Osher, and Andrea Bertozzi, A harmonic extension approach for collaborative ranking. [arXiv]

Da Kuang, Alex Gittens, and Raffay Hamid, piCholesky: Polynomial interpolation of multiple Cholesky factors for efficient approximate cross-validation. [arXiv]

Da Kuang, Barry Drake, and Haesun Park, Fast clustering and topic modeling based on rank-2 nonnegative matrix factorization. [arXiv]

Teaching

I will not be teaching MATH 156 in the Spring 2017 quarter.

Winter 2017: (UCLA) MATH 191 - Numerical Linear Algebra for Data Analysis

Fall 2016: (UCLA) MATH 151B - Applied Numerical Methods (II)

Summer 2016: (Tsinghua) Numerical Methods in Machine Learning

Spring 2016: (UCLA) MATH 285J - Graduate Seminar: Machine Learning

Winter 2016: (UCLA) MATH 191 - Numerical Linear Algebra for Data Analysis

Fall 2015: (UCLA) MATH 151A - Applied Numerical Methods (I)

Fall 2014: (Georgia Tech) CSE 6040 - Computing for Data Analysis: Methods and Tools

Software

Hierarchical Rank-2 NMF for document clustering and topic discovery

Symmetric NMF

kmeans3: Accelerating Matlab K-means with Simple Patches