Multi-label text classification is the problem of assigning each document its most relevant labels from a given label set. In real-world applications, document metadata and a label hierarchy are commonly available; however, most existing studies model only the text, with a few attempts that use a very small label hierarchy (up to several hundred labels). In this paper, we bridge this gap by formalizing the problem of metadata-aware text classification in a large label hierarchy (orders of magnitude larger than those previously used). To address this problem, we present MATCH, an end-to-end framework that leverages both document metadata and a large-scale label hierarchy. To incorporate metadata, we pre-train the embeddings of text and metadata in the same space and use fully connected attention to capture the interrelations between them. To leverage the label hierarchy, we propose different ways to regularize the parameters and output probability of each child label by its parents. Extensive experiments on two massive text datasets with large-scale label hierarchies demonstrate the effectiveness of MATCH over state-of-the-art deep learning baselines.
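The idea of regularizing a child label by its parents can be sketched as follows. This is an illustrative toy example, not the paper's exact formulation: it shows one hypothetical parameter penalty (pulling a child's weight vector toward its parent's) and one hypothetical probability constraint (capping a child's probability at its parent's); the function names and the `parent` array encoding are assumptions for illustration.

```python
import numpy as np

def param_regularizer(child_w, parent_w, lam=0.01):
    """L2 penalty pulling a child label's weight vector toward its parent's."""
    return lam * np.sum((child_w - parent_w) ** 2)

def hierarchy_probs(probs, parent):
    """Cap each child label's probability at its parent's probability.

    probs:  array of per-label predicted probabilities
    parent: parent[i] = index of label i's parent, or -1 for a root label
    (assumes parents are indexed before their children)
    """
    out = probs.copy()
    for i, p in enumerate(parent):
        if p >= 0:
            out[i] = min(out[i], out[p])
    return out

# Label 0 is the parent of labels 1 and 2.
probs = np.array([0.9, 0.95, 0.4])
parent = [-1, 0, 0]
adjusted = hierarchy_probs(probs, parent)
print(adjusted)  # child 1 is capped at its parent's 0.9
```

In a real model the parameter penalty would be added to the training loss, whereas the probability cap can be applied either as a soft constraint during training or as a post-processing step at inference time.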