Slate recommendation generates a list of items as a whole, instead of ranking each item individually, in order to better model intra-list positional biases and item relations. To cope with the enormous combinatorial space of slates, recent work treats slate recommendation as a generation task, so that a slate can be directly compressed and reconstructed. However, we observe that such generative approaches, despite their proven effectiveness in computer vision, suffer from a reconstruction-concentration dilemma in recommender systems: when focusing on reconstruction, they easily overfit the training data and rarely generate satisfactory recommendations; conversely, when focusing on satisfying user interests, they get trapped in a few items and fail to cover the item variation within slates. In this paper, we propose to augment accuracy-based evaluation with slate variation metrics that estimate the stochastic behavior of generative models. We then show that, rather than reaching either of the two undesirable extremes of the dilemma, a valid generative slate recommendation model lies in a narrow "elbow" region between them. We also show that item perturbation can enforce slate variation and mitigate the over-concentration of generated slates, expanding the "elbow" into an easy-to-find region. We further propose to separate a pivot selection phase from the generation process, so that the model can apply perturbation before generation. Empirical results show that this simple modification achieves even better variation at the same level of accuracy compared to post-generation perturbation methods.
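To make the pivot-then-perturb idea concrete, the following is a minimal toy sketch (not the paper's actual model): a pivot item is selected first from a candidate pool, then Gaussian noise perturbs the scores of the remaining items before the rest of the slate is filled. All names (`generate_slate`, `item_scores`, the noise level) are illustrative assumptions, not the authors' implementation.

```python
import random

def generate_slate(pivot_candidates, item_scores, slate_size, noise=0.5, seed=0):
    """Toy pivot-then-perturb slate generation (illustrative sketch only)."""
    rng = random.Random(seed)
    # Phase 1: select the pivot item BEFORE generation (here: the top-scored
    # candidate), anchoring the slate to the user's strongest interest.
    pivot = max(pivot_candidates, key=lambda i: item_scores[i])
    # Phase 2: perturb the remaining items' scores with Gaussian noise,
    # then fill the rest of the slate greedily from the perturbed scores.
    # Different seeds yield different slates, enforcing slate variation.
    perturbed = {i: s + rng.gauss(0.0, noise)
                 for i, s in item_scores.items() if i != pivot}
    rest = sorted(perturbed, key=perturbed.get, reverse=True)[:slate_size - 1]
    return [pivot] + rest

# Example usage with hypothetical item scores.
scores = {"a": 3.0, "b": 2.0, "c": 1.0, "d": 0.5}
slate = generate_slate(["a", "b"], scores, slate_size=3, seed=1)
```

Because the pivot is fixed before perturbation, stochasticity only affects the non-pivot slots, which is what lets accuracy (anchored by the pivot) coexist with variation (in the remainder of the slate).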