This paper compares the uniform-price and discriminative-price methods in a carbon auction market using multi-agent Q-learning. The government and the individual firms are modeled as agents. The government, as auctioneer, allocates initial permits in the carbon auction market, and the firms, as bidders, compete with each other to obtain a larger share of the auctioned permits. The model accounts for the carbon trading market, the penalty, the reserve price, and the bidding volume limitation. The simulation analysis shows that bidders behave differently under the two pricing methods for different amounts of carbon permits. Under the uniform-price method, the bidding volume, firms' profit, and trading volume for low permit amounts, and the government revenue, clearing price, trading price, and auction efficiency for high permit amounts, are greater than under the discriminative-price method. Bidding prices are more dispersed under the uniform-price method than under the discriminative-price method for all permit amounts considered.
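To make the difference between the two pricing methods concrete, the following minimal sketch clears a single sealed-bid auction under each rule. It is an illustration under stated assumptions, not the model used in the paper: the names `Bid` and `clear_auction` are hypothetical, the uniform clearing price is assumed to be the lowest accepted bid, bids below the reserve price are assumed to be rejected outright, and the Q-learning agents, penalty, bidding volume limitation, and secondary trading market are all omitted.

```python
from typing import NamedTuple


class Bid(NamedTuple):
    firm: str
    price: float   # bid price per permit
    volume: float  # number of permits requested


def clear_auction(bids, supply, reserve_price, rule):
    """Clear one auction round (illustrative sketch, not the paper's model).

    rule is "uniform" (every winner pays the clearing price) or
    "discriminative" (every winner pays its own bid price).
    Returns (payments per firm, government revenue, clearing price).
    """
    if rule not in ("uniform", "discriminative"):
        raise ValueError(f"unknown pricing rule: {rule}")

    # Assumption: bids below the reserve price are rejected.
    eligible = sorted(
        (b for b in bids if b.price >= reserve_price),
        key=lambda b: b.price,
        reverse=True,
    )

    # Award permits from the highest bid downward until supply runs out.
    remaining = supply
    awards = []
    for bid in eligible:
        if remaining <= 0:
            break
        quantity = min(bid.volume, remaining)
        awards.append((bid, quantity))
        remaining -= quantity

    if not awards:
        return {}, 0.0, None

    # Assumption: the clearing price is the lowest accepted bid price.
    clearing_price = awards[-1][0].price

    payments = {}
    for bid, quantity in awards:
        unit_price = clearing_price if rule == "uniform" else bid.price
        payments[bid.firm] = payments.get(bid.firm, 0.0) + unit_price * quantity

    return payments, sum(payments.values()), clearing_price


if __name__ == "__main__":
    bids = [
        Bid("A", 30.0, 40.0),
        Bid("B", 25.0, 50.0),
        Bid("C", 20.0, 30.0),
        Bid("D", 8.0, 20.0),  # below the reserve price, so rejected
    ]
    for rule in ("uniform", "discriminative"):
        payments, revenue, price = clear_auction(bids, 100.0, 10.0, rule)
        print(rule, payments, revenue, price)
```

With these made-up bids, the uniform rule yields a government revenue of 2000 and the discriminative rule yields 2650, both at a clearing price of 20. The comparison holds the bids fixed, whereas the simulation results summarized above concern bidders who adapt their bids to the pricing rule, so the sketch shows only the payment mechanics and not the reported outcomes.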