Using mobile agents for data aggregation in ad-hoc networks is a promising approach that is becoming increasingly popular. However, these agents need efficient route planning to optimize the quality of service (QoS), which is a challenging task in such an uncertain environment. Numerous previous works have presented schemes for route planning of mobile agents in wireless sensor networks, and other approaches have proposed the use of mobile agents for data aggregation in the Internet of Things (IoT). However, current route-planning approaches for mobile agents do not satisfy the requirements of the IoT, because IoT nodes are mobile and heterogeneous. In this paper, we propose an intelligent route-planning method that enables mobile agents in IoT systems to make the best decision when selecting the next node at each step. We use a Markov Decision Process (MDP) as the underlying optimization model, which is well known for its effectiveness in optimizing decision making under uncertainty. In this model, we consider the distances between nodes, the distance between each node and the sink, the residual energy of the nodes, and their priority as the MDP parameters. Our proposed method can reduce the energy consumption of IoT nodes and extend the lifetime of the system. Furthermore, it aims to maximize the reliability of the network and to reduce data transmission delay.
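
As a minimal illustration of how next-node selection can be cast as an MDP over the four parameters named above, consider the following sketch. It is not the model developed in this paper: the toy network, the linear reward, the weights `W_HOP`, `W_SINK`, `W_ENERGY`, `W_PRIORITY`, the discount factor `GAMMA`, and the deterministic transitions are all assumptions made for brevity. A state is the pair (current node, set of visited nodes), an action is the choice of the next unvisited node, and the optimal value is computed by memoized dynamic programming.

```python
from functools import lru_cache
from math import dist

# Hypothetical toy network: position, residual energy in [0, 1], priority in [0, 1].
NODES = {
    "A": {"pos": (1.0, 0.0), "energy": 0.9, "priority": 0.2},
    "B": {"pos": (2.0, 1.0), "energy": 0.4, "priority": 0.9},
    "C": {"pos": (0.0, 2.0), "energy": 0.7, "priority": 0.5},
}
SINK = (0.0, 0.0)

# Assumed weights and discount factor; none of these values come from the paper.
W_HOP, W_SINK, W_ENERGY, W_PRIORITY = 0.3, 0.1, 0.5, 1.0
GAMMA = 0.9


def reward(cur_pos, nxt):
    """Assumed linear reward for migrating from cur_pos to node nxt."""
    n = NODES[nxt]
    return (W_PRIORITY * n["priority"]
            + W_ENERGY * n["energy"]
            - W_HOP * dist(cur_pos, n["pos"])     # inter-node distance
            - W_SINK * dist(n["pos"], SINK))      # node-to-sink distance


@lru_cache(maxsize=None)
def value(current, visited):
    """Return (optimal value, best next node) for the state (current, visited).

    current is None while the agent is still at the sink.
    """
    cur_pos = SINK if current is None else NODES[current]["pos"]
    best = (0.0, None)  # stopping the tour is always allowed and is worth 0
    for nxt in NODES:
        if nxt in visited:
            continue
        v = reward(cur_pos, nxt) + GAMMA * value(nxt, visited | {nxt})[0]
        if v > best[0]:
            best = (v, nxt)
    return best


def plan_route():
    """Follow the optimal policy from the sink until stopping is best."""
    route, current, visited = [], None, frozenset()
    while True:
        _, nxt = value(current, visited)
        if nxt is None:
            return route
        route.append(nxt)
        current, visited = nxt, visited | {nxt}


if __name__ == "__main__":
    print(plan_route())
```

A stochastic variant would replace the single successor state with an expectation over possible outcomes (for example, a failed migration), which is where the MDP formulation becomes more expressive than a shortest-path heuristic.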