PhD University of Florence, 2001
Postdoc Bocconi University, 2001-2007
Postdoc Sapienza University, 2009-2010, 2011-2012, 2013-2014
TA and Lecturer, undergraduate and graduate Statistics and Probability courses, Bocconi, Sapienza, Trento, Tor Vergata University
Research topics
Bayesian nonparametrics, exchangeable Gibbs partitions, Gnedin-Pitman priors, stick-breaking constructions, Monte Carlo estimation, diversity estimation, Bayesian mixture modelling.
Contact
annalisa.cerquetti at gmail.com
News
Publications
Downloadable, or e-mail me to receive a copy
1. Bayesian Shannon entropy estimation under normalized inverse Gaussian priors via Monte Carlo sampling
Submitted to CLADAG Meeting 2023.
An analytical solution to Bayesian Shannon entropy estimation under general Gibbs-type priors
was devised in 2014 as a limiting case of Bayesian Tsallis entropy estimation.
Here we propose a different approach and derive a Monte Carlo solution under the normalized
inverse Gaussian prior, relying on known results for its stick-breaking representation.
2. On the first two size-biased picks from the normalized inverse Gaussian prior
In 36th International Workshop on Statistical Modelling (IWSM), Trieste, Italy, 18-22 July
2022.
The normalized inverse Gaussian random discrete distribution has
been extensively investigated in Bayesian nonparametrics as one of the possible tractable
alternatives to the Ferguson-Dirichlet prior and to its two-parameter Pitman-Yor
extension. Here we devise an easy-to-sample representation of its first two
size-biased picks and discuss a potential application to prior calibration.
3. Exact Good-Turing characterization of the two-parameter Poisson Dirichlet superpopulation model
(2019) arXiv:1404.3441 [math.ST]
We improve on a result in Biometrics (2016) by showing that, for any finite sample size, when the population frequencies are assumed to be drawn from a superpopulation with a two-parameter Poisson-Dirichlet distribution, Bayesian nonparametric estimation of the discovery probabilities corresponds to exact Good-Turing estimation. Moreover, under a general superpopulation hypothesis, the Good-Turing solution admits an interpretation as a modern Bayesian nonparametric estimator under partial information.
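As a rough illustration of the comparison described in this abstract, the sketch below sets the classical Good-Turing estimates next to the predictive probabilities of the two-parameter Poisson-Dirichlet (Pitman-Yor) model. The function names, the toy sample and the parameter values are assumptions made for the example. They are not taken from the paper, and the sketch does not reproduce its exact characterization.

```python
from collections import Counter


def frequency_counts(sample):
    """Return (n, k, m): sample size, number of distinct species,
    and m[l] = number of species observed exactly l times."""
    species_counts = Counter(sample)
    m = Counter(species_counts.values())
    return len(sample), len(species_counts), m


def good_turing(sample, l):
    """Classical Good-Turing estimate of the probability that the next
    observation is a species seen l times so far (l = 0: a new species)."""
    n, _, m = frequency_counts(sample)
    return (l + 1) * m.get(l + 1, 0) / n


def pitman_yor_predictive(sample, l, alpha, theta):
    """The same probability under the (alpha, theta) Poisson-Dirichlet
    predictive rule, assuming 0 <= alpha < 1 and theta > -alpha."""
    n, k, m = frequency_counts(sample)
    if l == 0:
        return (theta + k * alpha) / (theta + n)
    return (l - alpha) * m.get(l, 0) / (theta + n)


# Toy sample (hypothetical): species a, b, c, d, e with counts 3, 2, 1, 2, 1.
toy_sample = list("aaabbcdde")
for l in range(4):
    print(l, good_turing(toy_sample, l),
          pitman_yor_predictive(toy_sample, l, alpha=0.5, theta=1.0))
```

Both sets of estimates sum to one over the observed frequency levels, but they distribute the mass differently: Good-Turing uses the count of species at level l+1, whereas the predictive rule uses the count at level l together with the two parameters.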
4. Bayesian estimation of Gini-Simpson's index under mainland-island community structure
(2019) In Contributions to Theoretical and Applied Statistics. In Honour of Corrado Gini.
The mainland-island community structure converges in
the large population limit to the Hierarchical Dirichlet process. This finding provides
the analogue, in the multipopulation setting, of the Ewens sampling formula for the
single population neutral hypothesis. Here we derive a Bayesian nonparametric (BNP)
estimator of Gini-Simpson's index under the Hubbell Unified Neutral
Theory of Biodiversity and Biogeography.
5. Bayesian nonparametric estimation of Patil-Taillie-Tsallis diversity under Gnedin-Pitman priors
(2014) arXiv:1404.3441 [math.ST]
We present a fully general Bayesian nonparametric estimation of the whole class of Tsallis diversity indices under Gnedin-Pitman priors, a large family of random discrete distributions recently investigated in depth for posterior predictive estimation of species richness and discovery probability. We provide both prior and posterior analysis.
6. Bayesian nonparametric estimation of a generalized diversity index
(2015) SIS Treviso 2015 The Legacy of Corrado Gini - Proceedings
We obtain the posterior analysis of the Tsallis generalized diversity index under Gnedin-Pitman priors, a large family of random discrete distributions generalizing the characteristic structure of the Dirichlet process prior.
7. Yet another application of marginals of multivariate Gibbs distributions
(2013) arXiv:1312.5789
We give yet another example of the usefulness of working with marginals of multivariate Gibbs distributions (Cerquetti, 2013) in deriving Bayesian nonparametric estimators under Gibbs priors in species sampling problems. Here, in particular, we substantially reduce the length and complexity of the proofs in Bacallado et al. (2013, Th. 1 and Th. 2) for looking backward probabilities under incomplete information.
8. Bayesian nonparametric estimation of species diversity under Pitman-Yor process priors
(2013). Contributed paper to S.Co. 2013 - Milano, Italy. September 2013. (joint work with S. Poppe)
We propose a Bayesian nonparametric approach to evenness estimation in the same spirit as the
nonparametric richness estimation proposed in Lijoi et al. (2007, 2008). In particular we derive a Bayesian
estimator of Simpson's index under the hypothesis that the unknown relative abundances of different
species have a prior distribution belonging to the Pitman-Yor family. Our technique mostly relies
on Pitman's results for the prior moments of the sum of m-th powers of the random frequencies of the
chosen species sampling model and on specific posterior properties of the two-parameter model.
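To make the target quantity concrete, here is a minimal sketch of the prior and posterior mean of Simpson's index, taken here to be the sum of squared relative abundances, under a Pitman-Yor prior with parameters (alpha, theta). It uses the standard predictive rule and the fact that this mean equals the probability that the next two observations belong to the same species. It is not the derivation used in the paper, and the function names and toy counts are assumptions.

```python
def prior_simpson(alpha, theta):
    """Prior mean of sum_i P_i^2: the probability that the first two
    observations coincide. Assumes 0 <= alpha < 1 and theta > -alpha."""
    return (1 - alpha) / (theta + 1)


def posterior_simpson(counts, alpha, theta):
    """Posterior mean of sum_i P_i^2 given the observed species counts,
    computed as P(X_{n+1} = X_{n+2} | sample) via the predictive rule."""
    n = sum(counts)
    k = len(counts)
    # Both new observations fall in the same already observed species.
    same_old = sum((c - alpha) * (c + 1 - alpha) for c in counts)
    # The first new observation is a new species and the second repeats it.
    same_new = (theta + k * alpha) * (1 - alpha)
    return (same_old + same_new) / ((theta + n) * (theta + n + 1))


# Toy counts (hypothetical): five species observed 3, 2, 1, 2, 1 times.
print(prior_simpson(0.5, 1.0))
print(posterior_simpson([3, 2, 1, 2, 1], 0.5, 1.0))
```

With an empty sample (and theta > 0) the posterior expression reduces to the prior mean, which is a quick consistency check on the formula.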
9. Bayesian nonparametric estimation of global disclosure risk
Contributed paper at CLADAG 2013 - Modena, Italy. September 2013
In disseminating microdata, a classical measure of global disclosure risk
is the number of sample uniques that are also population uniques. A Bayesian estimation
of global risk of disclosure, based on frequencies of frequencies under the
superpopulation approach, was first introduced in Samuels (1998). Here we
propose a Bayesian nonparametric solution under two-parameter Poisson-Dirichlet
priors on the relative abundances of cross-classifications in the total population. We
rely on recent results for posterior predictive estimation of rare species richness under
Gibbs priors, a large class of models generalizing the partition structure of the
Dirichlet process prior.
10. A note on a Bayesian nonparametric estimator of the discovery probability
(2013) arXiv:1304.1030 [math.ST]
We provide the correct formulas for a novel Bayesian nonparametric estimator of the probability of detecting at the $(n+m+1)$th observation a species already observed with any given frequency in an enlarged sample of size $n+m$, conditionally on a basic sample of size $n$, together with the corresponding explicit result under $(\alpha, \theta)$ Poisson-Dirichlet priors, by means of a new technique devised in Cerquetti (2013).
11. Marginals of multivariate Gibbs distributions with applications in Bayesian species sampling
Electronic Journal of Statistics (2013), 7, 697-716.
We call into question the current approach to Bayesian nonparametric estimation in species sampling problems under Gibbs priors. We derive correct multivariate distributions and show that, relying on those, results for the corresponding sampling formulas can be obtained, generalized and sometimes fixed, working with marginals and a known result on falling factorial moments of a sum of non-independent indicators. We provide an application of our findings to a recently proposed Bayesian nonparametric estimation, under Gibbs priors, of the predictive probability of observing a species already observed a certain number of times.
12. Some contributions to the theory of conditional Gibbs partitions
In Complex Models and Computational Methods in Statistics, Series: Contributions to Statistics, 2013, pp 77-89.
We focus on the subclass of Poisson-Kingman partitions driven by the stable subordinator and, relying on the unconditional theory of exchangeable Gibbs partitions, derive some additional results for the posterior partition, the conditional alpha diversity and Stirling's approximation of the Gibbs weights.
13. Stirling's approximations for exchangeable Gibbs weights
arXiv:1206.6812v1 [math.PR] (2012)
We obtain some approximation results for the weights appearing in the exchangeable partition probability function identifying Gibbs partition models of parameter $\alpha \in (0,1)$, as introduced in Gnedin and Pitman (2006). We rely on approximation results for central and non-central generalized Stirling numbers and on known results for conditional and unconditional $\alpha$ diversity. We provide an application to an approximate Bayesian nonparametric estimation of discovery probability in species sampling problems under normalized inverse Gaussian priors.
14. Bayesian nonparametric estimation of Simpson's evenness index under alpha Gibbs priors
arXiv:1203.1666v1 [math.ST] (2012).
Explicit posterior predictive estimation of species richness has been obtained under priors belonging to the $\alpha$-Gibbs class (Gnedin & Pitman, 2006). Here we focus on posterior estimation of species evenness, which accounts for diversity in terms of proximity to a uniform distribution of the population across different species. We focus on Simpson's index and provide a Bayesian estimator under a quadratic loss function, with its variance, under some specific Gibbs priors.
15. Some further results for the two parameter Poisson-Dirichlet partition model
(2012). Proceedings of the XLVI Scientific Meeting of the Italian Statistical Soc. Rome, June, 2012.
We obtain some additional explicit results for the posterior partition generated
by sampling from the random atoms of a two-parameter Poisson-Dirichlet
model, conditional on a basic observed sample. Those results complement the large
number of conditional and unconditional results already obtained for this model, and
have applications in Bayesian nonparametric estimation in species sampling problems.
16. Conditional alpha-diversity for exchangeable Gibbs partitions driven by the stable subordinator
Proceedings of S.Co. 2011 - Padova, Italy. September 2011
The asymptotic behaviour of conditional $\alpha$ diversity for the two-parameter Poisson-Dirichlet partition model and for the normalized generalized Gamma model has recently been investigated in Favaro et al. (2009, 2011) with a view to possible applications in the Bayesian treatment of species richness estimation. Here we generalize those results to the larger class of mixed Poisson-Kingman species sampling models driven by the stable subordinator (Pitman, 2003).
17. A decomposition approach to Bayesian nonparametric estimation for species richness under two-parameter Poisson-Dirichlet priors
Contributed Paper to ASMDA 2011, June, 7-10, Rome, Italy. arXiv:1002.0535v1 [math.PR] (2010).
We present an alternative approach to the Bayesian nonparametric analysis of conditional species richness under two-parameter Poisson-Dirichlet priors. We rely on a known characterization via the deletion of classes property and on results for Beta-Binomial distributions. Besides leading to simplified and much more direct proofs, our proposal provides a new scale mixture representation of the conditional asymptotic law.
18. A simple proof of a generalization of the Chu-Vandermonde identity
arXiv:1012.1243v1 [math.PR] (2010)
We provide a simple proof of a generalization of the multivariate Chu-Vandermonde identity recently derived in Favaro et al. (2010a). Exploiting known results for rising factorials and fourth Lauricella polynomials, we show that resorting to the Laplace-type integral representation of the fourth Lauricella function may be avoided.
19. Reparametrizing the two-parameter Gnedin-Fisher partition model in a Bayesian perspective
Contributed Paper to the 58th World Statistics ISI Congress, Dublin, August 2011.
We introduce a new parametrization for the two-parameter species sampling model with a finite but random number of different species recently introduced in Gnedin (2010a). We show that the reparametrization yields a representation in terms of a generalized Waring mixture of Fisher species sampling models, and we derive the structural distribution of the model.
20. Bayesian nonparametric analysis for a species sampling model with finitely many types
Contributed paper to SIS 2010 Meeting, Padua, 14-16 June.
We derive an explicit Bayesian nonparametric analysis for a species sampling model with finitely many types of Gibbs form of type $\alpha= -1$ recently introduced in Gnedin (2009). Our results complement the existing analysis under Gibbs priors of type $\alpha \in [0, 1)$ proposed in Lijoi et al. (2008). Calculations rely on a group sequential construction of Gibbs partitions introduced in Cerquetti (2008).
21. A generalized sequential construction of exchangeable Gibbs partitions with application
Proceedings of S.Co. 2009, 14-16 September, Milano, Italy.
By resorting to sequential constructions of exchangeable random partitions (Pitman, 2006), and exploiting some known facts about generalized Stirling numbers, we derive a generalized Chinese restaurant process construction of exchangeable Gibbs partitions of type alpha (Gnedin and Pitman, 2006). Our construction represents the natural theoretical probabilistic framework in which to embed some recent results about a Bayesian nonparametric treatment of estimation problems arising in genetic experiments under Gibbs species sampling model priors.
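For readers unfamiliar with sequential constructions, the sketch below simulates the ordinary two-parameter Chinese restaurant process, the best-known special case of an exchangeable Gibbs partition of type alpha. It is not the generalized construction derived in the paper, which covers general Gibbs weights. The function name, the seed argument and the parameter values are assumptions made for the example.

```python
import random


def chinese_restaurant_partition(n, alpha, theta, seed=None):
    """Simulate block sizes of a random partition of {1, ..., n} from the
    two-parameter (alpha, theta) Chinese restaurant process.
    Assumes 0 <= alpha < 1 and theta > -alpha."""
    rng = random.Random(seed)
    blocks = []  # blocks[j] = number of customers seated at table j
    for _ in range(n):
        if not blocks:
            blocks.append(1)  # the first customer always opens a table
            continue
        k = len(blocks)
        # Existing table j has weight (size_j - alpha);
        # a new table has weight (theta + k * alpha).
        weights = [size - alpha for size in blocks] + [theta + k * alpha]
        j = rng.choices(range(k + 1), weights=weights)[0]
        if j == k:
            blocks.append(1)
        else:
            blocks[j] += 1
    return blocks


# Hypothetical parameter values, chosen only for illustration.
sizes = chinese_restaurant_partition(50, alpha=0.5, theta=1.0, seed=0)
print(len(sizes), sorted(sizes, reverse=True))
```

Seating one customer at a time keeps the sketch close to the sequential description; the number of occupied tables plays the role of the number of distinct species in a sample of size n.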
22. On a Gibbs characterization of normalized generalized Gamma processes
Statistics & Probability Letters, 78, (2008) 3123-3128
We show that a Gibbs characterization of normalized generalized Gamma processes, recently obtained in Lijoi, Pruenster and Walker (2007), can alternatively be derived by exploiting a characterization of exponentially tilted Poisson-Kingman models stated in Pitman (2003). We also complete this result by investigating the existence of normalized random measures inducing exchangeable Gibbs partitions of type $\alpha \in (-\infty, 0]$.
23. A note on Bayesian nonparametric priors derived from exponentially tilted Poisson-Kingman models
Statistics & Probability Letters, 77, (2007) 1705-1711
We derive the class of normalized generalized Gamma processes from Poisson-Kingman models (Pitman, 2003) with tempered alpha-stable mixing distribution. Relying on this construction, it can be shown that in Bayesian nonparametrics, results on quantities of statistical interest under those priors, like the analogues of the Blackwell-MacQueen prediction rules or the distribution of the number of distinct elements observed in a sample, arise as immediate consequences of Pitman's results.
24. A Poisson approximation for colored graphs under exchangeability
Sankhya, 68, 2, (2006) 183-197 (with S. Fortini)
We introduce random graphs with exchangeable hidden colours and prove an asymptotic result on the number of times a fixed graph appears as a subgraph of such a random graph. In particular we give necessary and sufficient conditions for the number of subgraphs isomorphic to a given graph to converge, under a negligibility assumption on the frequencies of colours. Moreover we prove that the limiting law, when it exists, is a mixture of Poisson distributions.
25. Some Results on the Number of Coincidences under Exchangeability
Atti della XLII Riunione Scientifica SIS 2004, Bari, June, 9-11. (with S. Fortini)
We present some results on the asymptotic distribution of the number of coincidences
in a Bayesian setting, that is, when the random mechanism generating the data is
specified through a prior distribution. In particular, we determine necessary and
sufficient conditions for the number of coincidences to converge in distribution to
a mixture of Poisson distributions, for a generic prior distribution, that is, under the sole
assumption that the data are exchangeable. Asymptotic results of this kind were already known
in the literature for particular prior distributions. The proofs rely on recent
results on coloured random graphs with exchangeable colours.