A Survey on Distribution Testing: Your Data is Big. But is it Blue?

by Clément L. Canonne

Theory of Computing, Graduate Surveys 9, pp. 1-100, 2020

Bibliography with links to cited articles

[1]   Jayadev Acharya, Clément L. Canonne, and Gautam Kamath: A chasm between identity and equivalence testing with conditional queries. Theory of Computing, (19):1–46, 2018. Preliminary version in RANDOM’15. [doi:10.4086/toc.2018.v014a019, arXiv:1411.7346]

[2]   Jayadev Acharya, Hirakendu Das, Ashkan Jafarpour, Alon Orlitsky, and Shengjun Pan: Competitive closeness testing. In Proc. 24th Ann. Conf. on Learning Theory (COLT’11), pp. 47–68, 2011. Accessible at http://proceedings.mlr.press/v19/acharya11a.html.

[3]   Jayadev Acharya, Hirakendu Das, Ashkan Jafarpour, Alon Orlitsky, Shengjun Pan, and Ananda Theertha Suresh: Competitive classification and closeness testing. In Proc. 25th Ann. Conf. on Learning Theory (COLT’12), volume 23, pp. 22.1–22.18, 2012.

[4]   Jayadev Acharya and Constantinos Daskalakis: Testing Poisson Binomial Distributions. In Proc. 26th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’15), pp. 1829–1840. ACM Press, 2015. [doi:10.1137/1.9781611973730.122, arXiv:1410.3386]

[5]   Jayadev Acharya, Constantinos Daskalakis, and Gautam C. Kamath: Optimal testing for properties of distributions. In Proc. Advances in Neural Information Processing Systems (NIPS’15), pp. 3577–3598, 2015. Available in NIPS’15. [arXiv:1507.05952]

[6]   Jayadev Acharya, Ashkan Jafarpour, Alon Orlitsky, and Ananda Theertha Suresh: A competitive test for uniformity of monotone distributions. In Proc. 16th Internat. Conf. on Artificial Intelligence and Stat. (AISTATS’13), pp. 57–65, 2013. Available in AISTATS’13.

[7]   Jayadev Acharya, Ashkan Jafarpour, Alon Orlitsky, and Ananda Theertha Suresh: Sublinear algorithms for outlier detection and generalized closeness testing. In IEEE Internat. Symp. Information Theory (ISIT’14), pp. 3200–3204. IEEE Comp. Soc. Press, 2014. [doi:10.1109/ISIT.2014.6875425]

[8]   Jayadev Acharya, Alon Orlitsky, Ananda Theertha Suresh, and Himanshu Tyagi: Estimating Rényi entropy of discrete distributions. IEEE Trans. Inform. Theory, 63(1):38–56, 2017. Preliminary version in SODA’15. [doi:10.1109/TIT.2016.2620435, arXiv:1408.1000]

[9]   Michał Adamaszek, Artur Czumaj, and Christian Sohler: Testing monotone continuous distributions on high-dimensional real cubes. In [56]. 2010. Preliminary version in SODA’10. [doi:10.1007/978-3-642-16367-8_13]

[10]   José A. Adell and Pedro Jodrá: Exact Kolmogorov and total variation distances between some familiar discrete distributions. J. Inequalities and Appl., 2006:64307, 2006. [doi:10.1155/JIA/2006/64307]

[11]   Maryam Aliakbarpour, Eric Blais, and Ronitt Rubinfeld: Learning and testing junta distributions. In Proc. 29th Ann. Conf. on Learning Theory (COLT’16), volume 49 of JMLR Workshop and Conference Proceedings, pp. 19–46. Springer, 2016. Available in COLT’16.

[12]   Noga Alon, Alexandr Andoni, Tali Kaufman, Kevin Matulef, Ronitt Rubinfeld, and Ning Xie: Testing k-wise and almost k-wise independence. In Proc. 39th STOC, pp. 496–505. ACM Press, 2007. [doi:10.1145/1250790.1250863]

[13]   Mark Yuying An: Log-concave probability distributions: theory and statistical testing. Technical Report Working Paper No. 95-03, Duke University Department of Economics, 1997. Preliminary version available at repec.org. [doi:10.2139/ssrn.1933]

[14]   Patrice Assouad: Deux remarques sur l’estimation. C. R. Acad. Sci. Paris Ser. I Math., 296(23):1021–1024, 1983.

[15]   Ziv Bar-Yossef: The Complexity of Massive Data Set Computations. Ph. D. thesis, UC Berkeley, 2002. Adviser: Christos Papadimitriou. Available at  http://webee.technion.ac.il/people/zivby/index\_files/Page1489.html.

[16]   Tuğkan Batu, Sanjoy Dasgupta, Ravi Kumar, and Ronitt Rubinfeld: The complexity of approximating the entropy. SIAM J. Comput., 35(1):132–150, 2005. Preliminary version in CCC’02. [doi:10.1137/S0097539702403645]

[17]   Tuğkan Batu, Eldar Fischer, Lance Fortnow, Ravi Kumar, Ronitt Rubinfeld, and Patrick White: Testing random variables for independence and identity. In Proc. 42nd FOCS, pp. 442–451. IEEE Comp. Soc. Press, 2001. [doi:10.1109/SFCS.2001.959920]

[18]   Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White: Testing that distributions are close. In Proc. 41st FOCS, pp. 189–197. IEEE Comp. Soc. Press, 2000. This is the preliminary version of  [19]. [doi:10.1109/SFCS.2000.892113]

[19]   Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White: Testing closeness of discrete distributions. J. ACM, 60(1):4:1–4:25, 2013. This is the journal version of [18]. [doi:10.1145/2432622.2432626]

[20]   Tuğkan Batu, Ravi Kumar, and Ronitt Rubinfeld: Sublinear algorithms for testing monotone and unimodal distributions. In Proc. 36th STOC, pp. 381–390. ACM Press, 2004. [doi:10.1145/1007352.1007414]

[21]   Piotr Berman, Sofya Raskhodnikova, and Grigory Yaroslavtsev: Lp-testing. In Proc. 46th STOC, pp. 164–173. ACM Press, 2014. [doi:10.1145/2591796.2591887]

[22]   Bhaswar Bhattacharya and Gregory Valiant: Testing closeness with unequal sized samples. In Proc. Advances in Neural Information Processing Systems (NIPS’15), pp. 2593–2601, 2015. Available in NIPS’15. [arXiv:1504.04599]

[23]   Arnab Bhattacharyya, Eldar Fischer, Ronitt Rubinfeld, and Paul Valiant: Testing monotonicity of distributions over general partial orders. In Proc. 2nd Symp. Innovations in Computer Science (ICS’11), pp. 239–252, 2011. Available at ICS’11.

[24]   Lucien Birgé: On the risk of histograms for estimating decreasing densities. Ann. Stat., 15(3):1013–1022, 1987. [doi:10.1214/aos/1176350489]

[25]   Eric Blais, Joshua Brody, and Kevin Matulef: Property testing lower bounds via communication complexity. Comput. Complexity, 21(2):311–358, 2012. Preliminary version in CCC’11. [doi:10.1007/s00037-012-0040-x]

[26]   Eric Blais, Clément L. Canonne, and Tom Gur: Distribution testing lower bounds via reductions from Communication Complexity. ACM Trans. Comput. Theory, 11(2):6:2–6:37, 2019. Preliminary version in CCC’17. [doi:10.1145/3305270, ECCC:TR16-168]

[27]   Cafer Caferov, Bariş Kaya, Ryan O’Donnell, and A. C. Cem Say: Optimal bounds for estimating entropy with PMF queries. In Proc. 40th Math. Found. Comp. Sci. (MFCS’15), pp. 187–198. Springer, 2015. [doi:10.1007/978-3-662-48054-0_16]

[28]   Clément L. Canonne: Big Data on the rise? Testing monotonicity of distributions. In Proc. 42nd Internat. Colloq. on Automata, Languages and Programming (ICALP’15), volume 9134 of Lecture Notes in Computer Science, pp. 294–305. Springer, 2015. [doi:10.1007/978-3-662-47672-7_24]

[29]   Clément L. Canonne: Are Few Bins Enough: Testing Histogram Distributions. In Proc. 35th Symp. on Principles of Database Systems (PODS’16), pp. 455–463. ACM Press, 2016. [doi:10.1145/2902251.2902274, ECCC:TR15-160]

[30]   Clément L. Canonne, Ilias Diakonikolas, Themis Gouleakis, and Ronitt Rubinfeld: Testing shape restrictions of discrete distributions. Theory Comput. Syst., 62(1):4–62, 2018. Preliminary version in STACS’16. [doi:10.1007/s00224-017-9785-6, arXiv:1507.03558]

[31]    Clément L. Canonne, Themis Gouleakis, and Ronitt Rubinfeld: Sampling correctors. SIAM J. Comput., 47(4):1373–1423, 2018. Preliminary version in ITCS’16. [doi:10.1137/16M1076666, arXiv:1504.06544]

[32]   Clément L. Canonne, Dana Ron, and Rocco A. Servedio: Unpublished, 2013.

[33]   Clément L. Canonne, Dana Ron, and Rocco A. Servedio: Testing equivalence between distributions using conditional samples. In Proc. 25th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’14), pp. 1174–1192. ACM Press, 2014. See also [34] (full version). [doi:10.1137/1.9781611973402.87]

[34]   Clément L. Canonne, Dana Ron, and Rocco A. Servedio: Testing probability distributions using conditional samples. SIAM J. Comput., 44(3):540–616, 2015. [doi:10.1137/130945508, ECCC:TR12-155, arXiv:1211.2664]

[35]   Clément L. Canonne and Ronitt Rubinfeld: Testing probability distributions underlying aggregated data. In Proc. 41st Internat. Colloq. on Automata, Languages and Programming (ICALP’14), pp. 283–295. Springer, 2014. [doi:10.1007/978-3-662-43948-7_24, ECCC:TR14-021, arXiv:1402.3835]

[36]   Sourav Chakraborty, Eldar Fischer, Yonatan Goldhirsh, and Arie Matsliah: On the power of conditional samples in distribution testing. SIAM J. Comput., 45(4):1261–1296, 2016. Preliminary version in ITCS’13. [doi:10.1137/140964199]

[37]   Siu-on Chan, Ilias Diakonikolas, Rocco A. Servedio, and Xiaorui Sun: Learning mixtures of structured distributions over discrete domains. In Proc. 24th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’13), pp. 1380–1394. ACM Press, 2013. [doi:10.1137/1.9781611973105.100, arXiv:1210.0864]

[38]   Siu-On Chan, Ilias Diakonikolas, Rocco A. Servedio, and Xiaorui Sun: Efficient density estimation via piecewise polynomial approximation. In Proc. 46th STOC, pp. 604–613. ACM Press, 2014. [doi:10.1145/2591796.2591848, arXiv:1305.3207]

[39]   Siu-On Chan, Ilias Diakonikolas, Gregory Valiant, and Paul Valiant: Optimal algorithms for testing closeness of discrete distributions. In Proc. 25th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’14), pp. 1193–1203. ACM Press, 2014. [doi:10.1137/1.9781611973402.88, arXiv:1308.3946]

[40]   Constantinos Daskalakis, Ilias Diakonikolas, Ryan O’Donnell, Rocco A. Servedio, and Li-Yang Tan: Learning sums of independent integer random variables. In Proc. 54th FOCS, pp. 217–226. IEEE Comp. Soc. Press, 2013. [doi:10.1109/FOCS.2013.31]

[41]   Constantinos Daskalakis, Ilias Diakonikolas, and Rocco A. Servedio: Learning k-modal distributions via testing. Theory of Computing, 10(20):535–570, 2014. Preliminary version in SODA’12. [doi:10.4086/toc.2014.v010a020, arXiv:1107.2700]

[42]   Constantinos Daskalakis, Ilias Diakonikolas, and Rocco A. Servedio: Learning Poisson binomial distributions. Algorithmica, 72(1):316–357, 2015. Preliminary version in STOC’12. [doi:10.1007/s00453-015-9971-3, arXiv:1107.2702]

[43]   Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio, Gregory Valiant, and Paul Valiant: Testing k-modal distributions: Optimal algorithms via reductions. In Proc. 24th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’13), pp. 1833–1852. ACM Press, 2013. [doi:10.1137/1.9781611973105.131]

[44]   Yuxin Deng and Wenjie Du: The Kantorovich metric in computer science: A brief survey. Electron. Notes Theor. Comput. Sci., 253(3):73–82, 2009. Preliminary version in QAPL’09. [doi:10.1016/j.entcs.2009.10.006]

[45]   Luc Devroye and Gábor Lugosi: Combinatorial Methods in Density Estimation. Springer Series in Statistics. Springer, 2001. [doi:10.1007/978-1-4613-0125-7]

[46]   Ilias Diakonikolas, Themis Gouleakis, John Peebles, and Eric Price: Collision-based testers are optimal for uniformity and closeness. Chicago J. Theoret. Comp. Sci., (1):1–21, 2019. [doi:10.4086/cjtcs.2019.001, ECCC:TR16-178, arXiv:1611.03579]

[47]   Ilias Diakonikolas and Daniel M. Kane: A new approach for testing properties of discrete distributions. In Proc. 57th FOCS, pp. 685–694. IEEE Comp. Soc. Press, 2016. [doi:10.1109/FOCS.2016.78, arXiv:1601.05557]

[48]   Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin: Optimal algorithms and lower bounds for testing closeness of structured distributions. In Proc. 56th FOCS, pp. 1183–1202. IEEE Comp. Soc. Press, 2015. [doi:10.1109/FOCS.2015.76, arXiv:1508.05538]

[49]   Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin: Testing identity of structured distributions. In Proc. 26th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’15), pp. 1841–1854. ACM Press, 2015. [doi:10.1137/1.9781611973730.123, arXiv:1410.2266]

[50]   Khanh Do Ba, Huy L. Nguyen, Huy N. Nguyen, and Ronitt Rubinfeld: Sublinear time algorithms for Earth Mover’s distance. Theory Comput. Syst., 48(2):428–442, 2011. [doi:10.1007/s00224-010-9265-8, arXiv:0904.0292]

[51]   Aryeh Dvoretzky, Jack Kiefer, and Jacob Wolfowitz: Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator. Ann. Math. Statist., 27(3):642–669, 1956. [doi:10.1214/aoms/1177728174]

[52]   Moein Falahatgar, Ashkan Jafarpour, Alon Orlitsky, Venkatadheeraj Pichapathi, and Ananda Theertha /Suresh: Faster algorithms for testing under conditional sampling. In Proc. 28th Ann. Conf. on Learning Theory (COLT’15), JMLR Proceedings, pp. 607–636, 2015. Available at MLR. [arXiv:1504.04103]

[53]   Eldar Fischer: The art of uninformed decisions: A primer to property testing. Bull. EATCS, 75:97–126, 2001.

[54]   Alison L. Gibbs and Francis Edward Su: On choosing and bounding probability metrics. Interdisciplinary Science Reviews, 70(3):419–435, 2002. [doi:10.1111/j.1751-5823.2002.tb00178.x, arXiv:math/0209021]

[55]   Peter W. Glynn: Upper bounds on Poisson tail probabilities. Oper. Res. Lett., 6(1):9–14, 1987. [doi:10.1016/0167-6377(87)90003-4]

[56]   Oded Goldreich, editor. Property Testing: Current Research and Surveys. Springer, 2010. LNCS 6390. [doi:10.1007/978-3-642-16367-8]

[57]   Oded Goldreich: On the communication complexity methodology for proving lower bounds on the query complexity of property testing. Electron. Colloq. on Comput. Complexity (ECCC), 20:73, 2013. [ECCC:TR13-073]

[58]   Oded Goldreich: The uniform distribution is complete with respect to testing identity to a fixed distribution. In Computational Complexity and Property Testing (O. Goldreich, ed.), pp. 152–172. Springer, 2020. [doi:10.1007%2F978-3-030-43662-9_10, ECCC:TR16-015]

[59]   Oded Goldreich, Shafi Goldwasser, and Dana Ron: Property testing and its connection to learning and approximation. J. ACM, 45(4):653–750, 1998. Preliminary version in FOCS’96. [doi:10.1145/285055.285060, ECCC:TR96-057]

[60]   Oded Goldreich and Dana Ron: On testing expansion in bounded-degree graphs. In [56], pp. 68–75. 2010. Preliminary version ECCC TR00-020 (2000). [doi:10.1007/978-3-642-22670-0_9]

[61]   Oded Goldreich and Dana Ron: On sample-based testers. ACM Trans. Comput. Theory, 8(2):7:1–7:54, 2016. Preliminary version in ITCS’15. [doi:10.1145/2898355, ECCC:TR13-109]

[62]   Oded Goldreich and Salil P. Vadhan: On the complexity of computational problems regarding distributions. In [56], pp. 390–405. 2011. [doi:10.1007/978-3-642-22670-0_27, ECCC:TR11-004]

[63]   Sudipto Guha, Andrew McGregor, and Suresh Venkatasubramanian: Streaming and sublinear approximation of entropy and information distances. In Proc. 17th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’06), pp. 733–742. ACM Press, 2006. Available at ACM DL. [arXiv:cs/0508122]

[64]   Sariel Har-Peled: Lower bound on estimating k=1nak for non-increasing (ak)k. Theoretical Computer Science Stack Exchange, 2015. http://cstheory.stackexchange.com/q/28024 (version: 2015-01-01).

[65]   Piotr Indyk, Reut Levi, and Ronitt Rubinfeld: Approximating and testing k-histogram distributions in sub-linear time. In Proc. 31st Symp. on Principles of Database Systems (PODS’12), pp. 15–22. ACM Press, 2012. [doi:10.1145/2213556.2213561, ECCC:TR11-171]

[66]   Jiantao Jiao, Yanjun Han, and Tsachy Weissman: Minimax estimation of the L1 distance. In IEEE Internat. Symp. Information Theory (ISIT’16), pp. 750–754. IEEE Comp. Soc. Press, 2016. [doi:10.1109/ISIT.2016.7541399, arXiv:1705.00807]

[67]   Jiantao Jiao, Kartik Venkat, Yanjun Han, and Tsachy Weissman: Minimax estimation of functionals of discrete distributions. IEEE Trans. Inform. Theory, pp. 2835–2885, 2015. [doi:10.1109/TIT.2015.2412945, arXiv:1406.6956]

[68]   Leonid V. Kantorovich and Gennady S. Rubinstein: On a space of totally additive functions. Vestnik, Leningrad University, 13(7):52–59, 1958.

[69]   Shang Keng Ma: Calculation of entropy from data of motion. J. Stat. Phys., 26(2):221–240, 1981. [doi:10.1007/BF01013169]

[70]   Lucien Le Cam: An approximation theorem for the Poisson binomial distribution. Pacific J. Math., 10(4):1181–1197, 1960. [doi:10.2140/pjm.1960.10.1181]

[71]   Lucien Le Cam: Convergence of estimates under dimensionality restrictions. Ann. Stat., 1(1):38–53, 1973. [doi:10.1214/aos/1193342380]

[72]   Lucien Le Cam: Asymptotic Methods in Statistical Decision Theory. Springer Series in Statistics. Springer, 1986. [doi:10.1007/978-1-4612-4946-7]

[73]   Reut Levi, Dana Ron, and Ronitt Rubinfeld: Testing properties of collections of distributions. Theory of Computing, 9(8):295–347, 2013. Preliminary version in ICS’11. [doi:10.4086/toc.2013.v009a008, ECCC:TR10-157]

[74]   Reut Levi, Dana Ron, and Ronitt Rubinfeld: Testing similar means. SIAM J. Discrete Math., 28(4):1699–1724, 2014. Preliminary version in ICALP’12. [doi:10.1137/120903737, ECCC:TR12-055]

[75]   Pascal Massart: The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality. Ann. Probab., 18(3):1269–1283, 1990. [doi:10.1214/aop/1176990746]

[76]   Ashley Montanaro and Ronald de Wolf: A Survey of Quantum Property Testing. Theory of Computing, 2016. Graduate Surveys, 7. [doi:10.4086/toc.gs.2016.007, arXiv:1310.2035]

[77]   Ryan O’Donnell and John Wright: Quantum spectrum testing. In Proc. 47th STOC, pp. 529–538. ACM Press, 2015. [doi:10.1145/2746539.2746582, arXiv:1501.05028]

[78]   Liam Paninski: Estimating entropy on m bins given fewer than m samples. IEEE Trans. Inform. Theory, 50(9):2200–2203, 2004. [doi:10.1109/TIT.2004.833360]

[79]   Liam Paninski: A coincidence-based test for uniformity given very sparsely sampled discrete data. IEEE Trans. Inform. Theory, 54(10):4750–4755, 2008. [doi:10.1109/TIT.2008.928987]

[80]   Michal Parnas, Dana Ron, and Ronitt Rubinfeld: Tolerant property testing and distance approximation. J. Comput. System Sci., 72(6):1012–1042, 2006. [doi:10.1016/j.jcss.2006.03.002, ECCC:TR04-010]

[81]   Siméon Denis Poisson: Recherches sur la probabilité des jugements en matière criminelle et en matière civile: précédées des règles générales du calcul des probabilités. Bachelier, 1837. Available at gallica.bnf.fr.

[82]   David Pollard: Asymptopia, 2003–2017. Manuscript.

[83]   Sofya Raskhodnikova, Dana Ron, Amir Shpilka, and Adam D. Smith: Strong lower bounds for approximating distributions support size and the distinct elements problem. SIAM J. Comput., 39(3):813–842, 2009. Preliminary version in FOCS’07. [doi:10.1137/070701649]

[84]   Laurence Reboul: Estimation of a function under shape restrictions. applications to reliability. Ann. Math. Statist., 33(3):1330–1356, 2005. [doi:10.1214/009053605000000138]

[85]   Leo Reyzin: Extractors and the leftover hash lemma, 2011. Lecture notes accessible at http://www.cs.bu.edu/ reyzin/teaching/s11cs937/notes-leo-1.pdf.

[86]   Dana Ron: Property Testing: A Learning Theory Perspective. Found. Trends Machine Learning, 1(3):307–402, 2008. Preliminary version in COLT’07. [doi:10.1561/2200000004]

[87]   Dana Ron: Algorithmic and analysis techniques in property testing. Found. Trends Theor. Comput. Sci., 5(2):73–205, 2010. [doi:10.1561/0400000029]

[88]   Ronitt Rubinfeld: Taming big probability distributions. ACM Crossroads, 19(1):24–28, 2012. [Unofficial French translation by C. Canonne]. [doi:10.1145/2331042.2331052]

[89]   Ronitt Rubinfeld and Rocco A. Servedio: Testing monotone high-dimensional distributions. Random Struct. Algorithms, 34(1):24–44, 2009. Preliminary version in STOC’05. [doi:10.1002/rsa.20247]

[90]   Ronitt Rubinfeld and Madhu Sudan: Robust characterization of polynomials with applications to program testing. SIAM J. Comput., 25(2):252–271, 1996. [doi:10.1137/S0097539793255151]

[91]   Ronitt Rubinfeld and Ning Xie: Testing non-uniform k-wise independent distributions over product spaces. In Proc. 37th Internat. Colloq. on Automata, Languages and Programming (ICALP’10), pp. 565–581. Springer, 2010. [doi:10.1007/978-3-642-14165-2_48]

[92]   Amit Sahai and Salil Vadhan: A complete problem for statistical zero knowledge. J. ACM, 50(2):196–249, 2003. Preliminary version in FOCS’97. [doi:10.1145/636865.636868, ECCC:TR00-084]

[93]   Henry Scheffé: A useful convergence theorem for probability distributions. Ann. Math. Statist., 18(3):434–438, 1947. [doi:10.1214/aoms/1177730390]

[94]    Wojciech Szpankowski: Analytic Poissonization and Depoissonization. In Average Case Analysis of Algorithms on Sequences, Ch. 10, pp. 442–519. John Wiley & Sons, Inc., 2001. [doi:10.1002/9781118032770.ch10]

[95]   Gregory Valiant: Algorithmic Approaches to Statistical Questions. Ph. D. thesis, UC Berkeley EECS, 2012. Adviser: Christos Papadimitriou.

[96]   Gregory Valiant and Paul Valiant: A CLT and tight lower bounds for estimating entropy. Electron. Colloq. on Comput. Complexity (ECCC), 17:179, 2010. [ECCC:TR10-179]

[97]   Gregory Valiant and Paul Valiant: Estimating the unseen: A sublinear-sample canonical estimator of distributions. Electron. Colloq. on Comput. Complexity (ECCC), 17:180, 2010. See also [98]. [ECCC:TR10-180]

[98]   Gregory Valiant and Paul Valiant: The power of linear estimators. In Proc. 52nd FOCS, pp. 403–412. IEEE Comp. Soc. Press, 2011. See also [96] and [97]. [doi:10.1109/FOCS.2011.81]

[99]   Gregory Valiant and Paul Valiant: An automatic inequality prover and instance optimal identity testing. SIAM J. Comput., 46(1):429–455, 2017. Preliminary version in FOCS’14. [doi:10.1137/151002526]

[100]   Paul Valiant: Testing symmetric properties of distributions. SIAM J. Comput., 40(6):1927–1968, 2011. Preliminary version in STOC’08. [doi:10.1137/080734066]

[101]   Bo Waggoner: p testing and learning of discrete distributions. In Proc. 6th Symp. Innovations in Theoretical Computer Science (ITCS’15), pp. 347–356. ACM Press, 2015. [doi:10.1145/2688073.2688095, arXiv:1412.2314]

[102]   Guenther Walther: Inference and modeling with log-concave distributions. Statistical Science, 24(3):319–327, 2009. [doi:10.1214/09-STS303]

[103]   Yihong Wu and Pengkun Yang: Minimax rates of entropy estimation on large alphabets via best polynomial approximation. IEEE Trans. Inform. Theory, 62(6):3702–3720, 2016. [doi:10.1109/TIT.2016.2548468, arXiv:1407.0381]

[104]   Bin Yu: Assouad, Fano, and Le Cam. In Festschrift for Lucien Le Cam, pp. 423–435. Springer, 1997. [doi:10.1007/978-1-4612-1880-7_29]