2018
Cross-Project Code Clones in GitHub Empirical Software Engineering, 2018
Within-Ecosystem Issue Linking: A Large-Scale Study of Rails Proceedings of the 7th ACM/IEEE International Workshop on Software Mining
On the Naturalness of Proofs Proceedings ESEC/FSE (NIER Track) 2018
Whom Are You Going to Call?: Determinants of @-Mentions in GitHub Discussions arXiv:1806.08457, 2018
One Size Does Not Fit All: An Empirical Study of Containerized Continuous Deployment Workflows, FSE 2018
A Survey of Machine Learning for Big Code and Naturalness ACM Computing Surveys, 51(4) 2018 pdf
Mining Semantic Loop Idioms IEEE Transactions on Software Engineering 2018 44(7) pdf
Modern Food Foraging Patterns: Geography and Cuisine Choices of Restaurant Patrons on Yelp IEEE Transactions on Computational Social Systems, vol. 5, no. 2, pp. 508-517, June 2018. doi: 10.1109/TCSS.2018.281965
A Clustering-based Approach for Mining Dockerfile Evolutionary Trajectories SCIS
Determinants of quality, latency, and amount of Stack Overflow answers about recent Android APIs PLOS One, March 2018, https://doi.org/10.1371/journal.pone.0194139
2017
Are Deep Neural Networks the Best Choice for Modeling Source Code? ESEC/FSE 2017 pdf
Recovering Clear, Natural Identifiers from Obfuscated JavaScript Names. ESEC/FSE 2017 pdf
Some From Here, Some From There: Cross-Project Code Reuse in GitHub MSR 2017 pdf ACM Distinguished Paper Award
A Large Scale Study of Programming Languages and Code Quality in Github CACM, 60 (10), p.91-100, 2017
CACM also ran a Technical Perspective article about our article. It is available here.
How Do Software Engineering Practices Change Following Adoption of Continuous Integration? 32nd IEEE/ACM International Conference on Automated Software Engineering, 2017
Perceived Language Complexity in GitHub Issue Discussions and Their Effect on Issue Resolution 32nd IEEE/ACM International Conference on Automated Software Engineering, 2017
Social synchrony on complex networks IEEE Transactions on Cybernetics, PP(99), p.1-12, 2017
2016
Tracing distributed collaborative development in Apache Software Foundation projects Empirical Software Engineering, 2016, doi:10.1007/s10664-016-9463-3
Initial and Eventual Software Quality Relating to Continuous Integration in GitHub arXiv:1606.00521, 2016
Converging Work-Talk Patterns in Online Task-Oriented Communities PLOS One 11(5): e0154324. doi:10.1371/journal.pone.0154324, 2016
Stochastic Actor-Oriented Modeling for Studying Homophily and Social Influence in OSS Projects, (accepted), ESE 2016 pdf
Belief and Evidence in Empirical Software Engineering, (accepted), ICSE 2016 pdf
On the “Naturalness” of buggy code (accepted), ICSE 2016 pdf
The Sky is Not the Limit: Multitasking on GitHub Projects (accepted), ICSE 2016
2015
On the “Naturalness” of buggy code Unpublished, on arXiv
Developer Migration in the GitHub Ecosystem ESEC/FSE 2015
Quality and Productivity Outcomes Relating to Continuous Integration in GitHub ESEC/FSE 2015
CACHECA: A Cache Language Model Based Code Suggestion Tool. ICSE 2015 Demonstration Track pdf
Gender and Tenure Diversity in Github Teams. CHI 2015 pdf
Assert Use in GitHub Projects. ICSE 2015 pdf
Wait For It: Determinants of Pull Request Evaluation Latency on GitHub MSR 2015 pdf
Will they like this? Evaluating Code Contributions With Language Models MSR 2015 pdf
New Initiative: Naturalness of Software. ICSE 2015 NIER Track pdf Winner, Best Paper Award
A Large Scale Study of Programming Languages and Code Quality in Github. The version currently in the ACM DL has been updated; see this pdf version for errata/details. We have requested an update to the ACM DL version.
2014
On the Localness of Software FSE 2014 pdf
The Plastic Surgery Hypothesis FSE 2014 pdf
Panning Requirement Nuggets in Stream of Software Maintenance Tickets FSE 2014
Focus-Shifting Patterns of OSS Developers and Their Congruence with Call Graphs FSE 2014 pdf
A Large Scale Study of Programming Languages and Code Quality in Github FSE 2014 pdf
Comparing Static bug finders and Statistical defect prediction ICSE 2014 pdf DATA
How Social Q&A Sites are Changing Knowledge Sharing in Open Source Software Communities CSCW 2014
2013
Using and Asking: APIs Used in the Android Market and Asked About in StackOverflow SocInfo2013 pdf
Sample Size vs. Bias in Defect Prediction. ESEC/FSE 2013 pdf
Asking for (and about) Permissions Used by Android Apps MSR 2013
Dual Ecological Measures of Focus in Software Development. ICSE 2013 pdf Winner, ACM SIGSOFT Distinguished Paper Award
How, and Why Process Metrics are Better. ICSE 2013 pdf
2012
To what extent could we detect field defects? ASE 2012 pdf
When Would This Bug Get Reported? ICSM 2012 pdf
MIC Check: A Correlation Tactic for ESE Data, MSR 2012 pdf
On the “Naturalness” of software, Appeared in ICSE 2012 pdf (Expanded Version!)
Recalling the Imprecision of Cross-Project Defect Prediction, Appeared in FSE 2012 pdf
Cohesive and Isolated Development with Branches, Appeared in FASE 2012
Clones: what is that smell? Accepted to Springer-Verlag International Journal on Empirical Software Engineering. pdf
2011
Got Issues? Do New Features and Code Improvements Affect Defects? WCRE 2011 pdf
Ecological Inference in Empirical Software Engineering. ASE 2011 pdf Winner, ACM SIGSOFT Distinguished Paper and ASE 2011 Best Paper Awards.
BugCache for Inspections: Hit or Miss? SIGSOFT FSE 2011 pdf
Don’t Touch My Code! Examining the Effects of Ownership on Software Quality. SIGSOFT FSE 2011 pdf
A Simpler model of software readability. MSR 2011 pdf
Operating System Compatibility Analysis of Eclipse and Netbeans Based on Bug Data. MSR 2011 Mining Challenge
Ownership, Experience and Defects: a fine-grained study of Authorship. ICSE 2011 pdf
An Empirical Study on the Influence of Pattern Roles on Change-Proneness accepted to Empirical Software Engineering Journal Springer-Verlag, 2011. pdf
2010
The missing links: bugs and bug-fix commits. SIGSOFT FSE 2010 pdf
Validity of Network Analyses in Open Source Projects. MSR 2010 pdf
Clones: What is that Smell? MSR 2010 pdf Winner, Best Paper Award, MSR 2010
Thex: Mining Metapatterns in Java. MSR 2010 pdf
2009
Putting it All Together: Using Socio-Technical Networks to Predict Failures ISSRE 2009.
Fair and Balanced? Bias in bug-fix Datasets SIGSOFT FSE 2009 pdf
Promises and Perils of Mining Git MSR 2009 pdf
Modeling and verifying a broad array of network properties Europhysics Letters (EPL), 2009 pdf
Does Distributed Development Affect Software Quality? An Empirical Case Study of Windows Vista, ICSE 2009 pdf Winner, ACM SIGSOFT Distinguished Paper Award