Most learning algorithms assume that a data set is given initially. We address the common situation where data is not available initially, but can be obtained, at a cost. We focus on learning Bayesian belief networks (BNs) over discrete variables. As such BNs are models of probabilistic distributions, we consider the "generative" challenge of learning the parameters, for a fixed structure, that best match the true distribution. We focus on the budgeted learning setting, where there is a known fixed cost c_i for acquiring the value of the i-th feature for any specified instance, and a known total cost to spend acquiring all information. After formally defining this problem from a Bayesian perspective, we first consider allocation algorithms that must decide, before seeing any results, which features of which instances to probe. We show this is NP-hard, even if all variables are independent, then prove that the greedy allocation algorithm IGA is optimal when the costs are uniform and the features are independent, but can otherwise be sub-optimal. We then show that general (non-allocation) policies perform better, and explore the challenges of learning the parameters for general belief networks in this setting, describing conditions for when the obvious round-robin algorithm will, versus will not work optimally. We also explore the effectiveness of this and various other heuristic algorithms.