Device-to-device (D2D) communications have been regarded as a promising technology for meeting the dramatically increasing video data demand in 5G networks. In this paper, we consider the power control problem in a multi-user video transmission system. Due to the non-convex nature of the optimization problem, it is challenging to obtain an optimal strategy. In addition, many existing solutions require instantaneous channel state information (CSI) for each link, which is hard to obtain in resource-limited wireless networks. We develop a multi-agent deep reinforcement learning based power control method, in which each agent adaptively controls its transmit power based on its observed local state. The proposed method aims to maximize the average quality of the videos received by all users while satisfying each user's quality requirement. After off-line training, the method can be implemented in a distributed manner, allowing all users to reach their target states from any initial state. Compared with conventional optimization-based approaches, the proposed method is model-free, requires no CSI, and scales to large networks.
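To make the idea concrete, the following is a minimal sketch of distributed, learning-based power control in the spirit described above. All names, the state encoding, the discrete power levels, and the toy reward are illustrative assumptions; the paper's actual method uses a deep Q-network trained on multi-user video-quality rewards, whereas this sketch substitutes a tabular Q-learning agent and a stand-in environment for brevity.

```python
import random

# Hypothetical sketch (not the authors' exact design): each D2D
# transmitter is an independent agent that picks a discrete transmit
# power from its *local* observation only -- no CSI exchange.

POWER_LEVELS = [0.0, 5.0, 10.0, 15.0, 20.0]  # dBm; assumed discretization

class PowerControlAgent:
    """One agent per transmitter; learns purely from local transitions."""

    def __init__(self, n_states, alpha=0.1, gamma=0.9, eps=0.1):
        # Tabular Q-values: rows = quantized local states, cols = power levels.
        self.q = [[0.0] * len(POWER_LEVELS) for _ in range(n_states)]
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def act(self, state):
        # Epsilon-greedy choice of a power-level index from the local state.
        if random.random() < self.eps:
            return random.randrange(len(POWER_LEVELS))
        row = self.q[state]
        return row.index(max(row))

    def update(self, s, a, reward, s_next):
        # Standard Q-learning backup on the locally observed transition.
        target = reward + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])

def toy_step(state, action):
    # Stand-in environment: reward peaks at a mid-range power, capturing
    # the video-quality vs. interference trade-off in caricature; states
    # mimic quantized local-SINR buckets.
    reward = -abs(POWER_LEVELS[action] - 10.0)
    next_state = random.randrange(3)
    return reward, next_state

random.seed(0)
agent = PowerControlAgent(n_states=3)
s = 0
for _ in range(5000):  # off-line training phase
    a = agent.act(s)
    r, s2 = toy_step(s, a)
    agent.update(s, a, r, s2)
    s = s2

# After training, the greedy policy can run distributedly with no CSI:
# each agent just reads its local state and looks up its own Q-values.
best = max(range(len(POWER_LEVELS)), key=lambda i: agent.q[0][i])
print(POWER_LEVELS[best])
```

The key property mirrored here is that training produces a policy each agent can execute on its own at run time, using only local observations, which is what makes the approach scalable to large networks.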