Cloud computing provides an attractive paradigm in which computational resources are rented on demand, freeing users from capital and maintenance costs. Cloud providers offer different pricing options to meet the computing requirements of a wide variety of applications. An attractive option for batch computing is spot instances, which allow users to place bids for spare computing instances and rent them at a price that is often substantially lower than the fixed on-demand price. However, this raises three main challenges for users: how many instances to rent, what type to rent (on-demand, spot, or both), and what bid value to use for spot instances. In particular, renting on-demand instances risks high cost, while renting spot instances risks job interruption and delayed completion whenever the spot market price exceeds the bid. This paper introduces an online learning algorithm for resource allocation that addresses this fundamental tradeoff between computation cost and performance. Our algorithm dynamically adapts resource allocation by learning from its performance on prior job executions while incorporating the history of spot prices and workload characteristics. We provide theoretical bounds on its performance and prove that the average regret of our approach (compared to the best policy in hindsight) vanishes to zero over time. Evaluation on traces from a large datacenter computing cluster shows that our algorithm outperforms greedy allocation heuristics and quickly converges to a small set of best-performing policies. A key benefit of our approach is that the allocation strategies of the output policies are interpretable, enabling users to apply them directly in practice.
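As a concrete illustration of the kind of online learning that yields vanishing average regret against the best policy in hindsight, the minimal sketch below applies a standard exponential-weights (Hedge) update over a finite set of candidate allocation policies. The policy representation, the `job_loss` callback, and the full-information assumption (that each candidate policy's cost and delay can be evaluated in hindsight from the observed spot-price trace) are illustrative assumptions, not necessarily the exact algorithm developed in the paper.

```python
import math
import random


def hedge_allocation(policies, job_loss, jobs, eta=0.1):
    """Hedge (exponential weights) over candidate allocation policies.

    policies : list of candidate policies, e.g. (num_on_demand, num_spot, bid)
               tuples -- a hypothetical encoding.
    job_loss : callable(policy, job) -> loss in [0, 1] combining monetary cost
               and delay, assumed computable in hindsight from observed spot prices.
    jobs     : sequence of observed jobs (workload plus spot-price trace).
    Returns the per-job policy choices and the final weights.
    """
    weights = [1.0] * len(policies)
    choices = []
    for job in jobs:
        total = sum(weights)
        probs = [w / total for w in weights]
        # Deploy a policy for this job, randomized in proportion to its weight.
        choices.append(random.choices(range(len(policies)), weights=probs)[0])
        # After the job completes, the loss of every candidate policy can be
        # evaluated in hindsight, so all weights are updated (full information).
        for i, policy in enumerate(policies):
            weights[i] *= math.exp(-eta * job_loss(policy, job))
    return choices, weights
```

With a suitable learning rate, this multiplicative-weights scheme guarantees that average regret relative to the single best fixed policy goes to zero as the number of jobs grows, which is the flavor of guarantee stated above; because the final weights concentrate on a few policies, the surviving policies can be read off and applied directly.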