Scientific peer review is pivotal to maintaining quality standards for academic publication. The effectiveness of the process is currently challenged by the rapid increase in paper submissions at many conferences. These conferences must recruit large numbers of reviewers with varying levels of expertise and background, and the submitted reviews often fail to meet the conformity standards of the conference. This situation places an ever-growing burden on meta-reviewers in making decisions. In this work, we propose a human-AI approach that estimates the conformity of reviews to conference standards. Specifically, we ask peers to grade each other’s reviews anonymously on important criteria of review conformity, such as clarity and consistency. We introduce a Bayesian framework that learns the conformity of reviews from the peer gradings and from the historical reviews and decisions of a conference, while accounting for grading reliability. Our approach helps meta-reviewers easily identify reviews that require clarification and detect submissions that require discussion, without inducing additional overhead for reviewers. Through a large-scale crowdsourced study in which crowd workers are recruited as graders, we show that the proposed approach outperforms both machine learning and review gradings used alone, and that it can be easily integrated into existing peer review systems.