A web marketing system with automatic pricing
Naoki Abe Tomonari Kamba
NEC C& C Media Res. Labs. and Human Media Res. Labs.
411 Miyazaki, Miyamaeku, Kawasaki 2168555 JAPAN
abe@ccm.cl.nec.co.jp, kamba@hml.cl.nec.co.jp
Abstract:
We propose a new scheme of `automatic pricing' for digital contents, and describe an implemented system as well as concrete pricing algorithms for it. Automatic pricing refers to a methodology of automatically setting sales prices to optimal prices, based on past prices and sales. In particular, we consider the case in which automatic pricing is done in order to maximize the profit of an online marketing site. We describe a demo site for online marketing with automatic pricing, which we call `digiprice.' We will also describe the concrete pricing algorithms we employ in digiprice, and report on preliminary performance evaluation experiments we conducted using simulated data. The results of experimentation verify that our methods are practical in terms of both the speed of convergence to the optimal price and computational efficiency.
Keywords: electronic commerce, online marketing, automatic pricing, machine learning, online learning
1. Introduction
With the recent technological and social developments surrounding the World Wide Web, it is fast developing as a medium for commerce as well as a medium for information exchange. Electronic commerce not only helps lubricate economical activities, but is bringing about a fundamental change to the practice of commerce that we have been familiar with over many centuries. Shopping sites on the Internet began at first as a mere copy of the offline shops that make sales with fixed prices, but soon auction sites such as ebay and priceline [5,10] became wellaccepted, and now `reverse auction' sites are gaining popularity. Such forms of commerce did exist before the emergence of the World Wide Web, but these online marketing sites are qualitatively different from their offline counterparts, due to the substantially greater reach they enjoy. These online sites are also attracting attention as a medium for community formation.
Electronic commerce is affecting what are being sold in addition to the way they are sold. Consumers can now purchase pictures, music and software as digital contents by downloading them from online sites. For manufacturers, digital contents allow massproduction with virtually no additional cost. Furthermore, electronic commerce is affecting the mechanism of pricing, one of the fundamental elements of commerce. Online auction and reverse auction sites, by their large audience size, are affecting the existing practices of pricing. The prices are more and more determined by the consumers, rather than the manufacturers. Conceptually, this property is shared by any capitalist economy, but the speed and the manner by which the consumers can affect pricing in electronic commerce make it qualitatively different.
In this paper, we propose a new scheme of `automatic pricing' for digital contents, and describe an implemented system as well as concrete pricing algorithms. Automatic pricing refers to a methodology of automatically setting sales prices to optimal prices, based on past prices and sales. In particular, we consider the case in which automatic pricing is done in order to maximize the profit of an online marketing site. In what follows, we begin by giving an outline of a demo site for online marketing with automatic pricing, which we call `digiprice.' We will then describe the concrete pricing algorithms we employ in digiprice, and report on performance evaluation experiments we conducted using simulated data.
2. Digiprice: A webmarketing system with automatic pricing
Digiprice is an online shopping server equipped with an automatic pricing function. Electronic payment function has not yet been implemented in the system and the site is operating internally on an intranet. It is therefore not freely available on the Internet. Figure 1 shows the overall system architecture. As shown in this figure, a seller is expected to specify conditions of sales for each item, such as the initial price and the minimum possible price, and the system automatically adjusts the sales price so as to maximize the sales revenue within those conditions. For this purpose, the system keeps the time of sales in addition to the quantity of sales in its database. This is because how the amount of sales per unit time changes as the price is adjusted plays a key role in automatic pricing. The system constantly calculates the current sales price for each item, based on the past records of prices and sales and the conditions specified by the seller.
Figure 1: System architecture of digiprice.
Figure 2 shows the top page of digiprice, and
Figure 2: Top page of digiprice.
Figure 3 gives a page for a seller to register an item for online sales, and Figure 4 shows an example page for a consumer to select an item, to purchase or to examine.
Figure 3: Seller's page of digiprice.
Figure 4: User's page of digiprice.
As shown in Figure 4, a consumer can check to see how the prices of a given content has changed in the past, and refer to comments made by other users. In particular, Figure 5 depicts what can happen, when the user clicks on the `check market' button on the page shown in Figure 4. The past prices of the specified good are plotted over time in the graph. The users can, for example, view this graph, try and predict what will happen to the price in the near future, and make their decision on whether and when to buy a given item.
Figure 5: `Check market' page of digiprice.
Figure 6 shows an example of a page that appears when a user has requested to purchase a certain item. Purchase of a good can mean different things depending on what is being bought. For example, if it is a digital picture that is being purchased, then it may mean downloading of a high quality digital picture file. If it is music, then it may mean downloading music data, which can then be enjoyed on a publically available software.
Figure 6: Purchase page of digiprice.
3. Automatic pricing methods employed by digiprice
The automatic pricing methods that we propose and that are implemented as part of `digiprice' are the following three.
 A pricing method that determines the price of each good independently, without using any attributes associated with them.
 A pricing method that determines the prices of all goods at once, as a function of their attributes.
 A couple of selection methods that determine which goods to display in order to optimize the tradeoff between selling more immediately and estimating better the pricing function. This is to be used in combination with the above attributebased method as the pricing method.
3.1 A pricing method for individual goods
With some exceptions, it is generally believed that the amount of commercial goods sold is a decreasing function of their prices. Let us denote the amount of sales per unit time of (a fixed good) by , at price p. Some goods are more sensitive to the price than others, and thus is unknown a priori to an online web marketing system. An especially desirable feature of online marketing is that it allows us to observe the amount of sales as a function of the price and optimize the price in an online fashion.
In general, profit per sales depends on the price as well as the amout of sales. In reality, this dependence is complicated due to factors such as economy to scale. Here, for simplicity, we suppose that the total profit can be approximated by the following function of price p and amount of sales , where we use to denote the production cost (per item) at price p and amount of sales N.
We further assume that the total cost does not depend on the amount of sales, allowing us to simplify the above to
This assumption is reasonable for digital contents.
Now the goal of an automatic pricing method is to find a price p that maximizes quickly, and set the price accordingly. That is, it wishes to find such that
We emphasize here that the goal is to quickly find , and not necessarily to estimate the entire function .
Below, we will describe a pricing method that estimates for each good independently, and sets the price automatically.
A pricing method based on stochastic approximation
Stochastic approximation is a general methodology for online function optimization, which estimates the maximum of a regression function by testing and obtaining the (estimate of) function values at points of its choice, and gradually converge to the optimum point. (c.f. [12].) In this section, we propose an automatic pricing method based on this general technique.
The basic idea of stochastic approximation is as follows. In trying to maximize , which dictates the expected reward at point x, it estimates the value of the derivative at its current point x, and determines the next point by updating the current position as . Since we cannot observe the values of directly, one need take care to ensure that the estimation of the derivative is credible. We basically follow the idea of the KieferWolfowitz method. Having set the initial point x to be some arbitrarily chosen point, this method repeats the following process.
 For step size , appropriately determined as a decreasing function of the number of trials (up to that point), test at two points and , and obtain rewards and , whose expectations are dictated by the function f.
 Update x as follows.
Here, in order to ensure that the x's so obtained converge to the (local) maximum point of f, certain constraints must be satisfied by the values of a and , as will be specified later.
Based on this general idea, we propose a pricing method for a single good. As indicated above, we assume that the profit function can be approximated as
For maximization of , C can be ignored, so we further simplify the above to obtain
In other words, we are simply to maximize the total revenue.
There can be legal constraints on pricing (against dumping, for example), so we assume that there are maximum and minimum possible prices, and , given a priori to the system. We assume also that the system is given a reasonable price as the initial price. Based on these minimal pieces of information, our first pricing method (StochPrice:Stochastic Pricing) repeats the following process.
 Using step size , determined as a decreasing function of the number of trials so far (for example, , where I is the current trial number), conduct online sales at both prices and for a certain fixed period of time, and based on the amount of sales obtained during these periods, and , calculate the profits (revenues) obtained for the respective prices.
 Update the current price p as follows.
Here, A is an update interval, set as a decreasing function of the trial number (for example ), and T is a measure of the duration of each trial. (To be precise, the size of T depends strongly on what time unit is used. In our experiments, we actually used the average number of visits to the site in a unit period as the value of T.)  If p is either above or below , then clamp its value so that it falls within these bounds.
In order to ensure convergence of the obtained prices to the (local) optimum, it suffices to see that the following conditions hold on A and described above [12]:
Note that both of these conditions are met by the examples we gave earlier of and .
We give the details of this method as a pseudocode below.
3.2 Attributebased automatic pricing method
The pricing method proposed in the previous section suffers from the shortcoming that each new product will basically have to be priced from scratch. Using attributes associated with good and users, it is supposed that earlier experiences can be used to price new goods more intelligently. In this section, we propose such a method.
Let us write X for a (binary) attribute vector associated with a fixed good, and write for its ith component. These attributes can be attributes of the good, such as its category, those of a user, such as their demographic attributes, and their combinations. For example, based on attributes like iff cosmetics, iff woman, combined attributes like can be obtained. Strictly speaking, therefore, these attribute are not attributes of the goods alone, but are attributes of potential `sales,' consisting of a user, a good and possibly the environment. In digiprice, we use binary attributes of the contents, those of the users, and their logical combinations (conjunctions).
The basic idea behind the pricing method we propose here is that the optimal price of a good can be approximated by a linear function of its attributes. That is, there is a weight vector W having the same dimensionality as their associated attribute vectors, such that for an arbitrary good, its profit function approximately attains maximum at , where X is its associated attribute vector, i.e.
Here, we emphasize that this assumption is much weaker than an assumption that itself is approximable by a simple function. In general, it is supposed that will take a complicated form, but it is reasonable to suppose that its optimal price can be linearly approximated. Since the goal of an automatic pricing method is not necessarily to estimate but that of , this assumption helps us design a simple method for it.
Automatic pricing method based on linear approximation of optimal price
The pricing method we propose here is based on the KieferWolfowitz method as before, but here the search for optimality is done in a higher dimensional attribute space.
 For each good, calculate its associated vector based on its attributes and those of the current user.
 For each good i, set the initial price to , using the weight vector W.
 For each good i, generate a random vector of length . Here, is a step size, determined as a decreasing function of the trial number I (for example, .)
 Using the vector thus obtained, set the current price for each good i as follows. Here, for each good, if goes above the maximum price or below the minimum price, then clamp the vector so that this is not so.
 Conduct online sales at the above price for a certain period.
 Next, conduct online sales for the same period, with the price set as follows. Here, too, should be clamped if necessary.
 For each good, for the respective prices, calculate the total profits, based on the amount of sales obtained for each of the above periods, and .
 Once for each i, update the weight W using the value of , as follows.
The details of this method, FeaturePrice(Featurebased Pricing) are shown as a pseudocode below.
3.3 Methods for selecting goods to display
We have so far discussed how to automatically set the prices of goods for online marketing, in order to maximize the resulting revenue. When the number of items to be sold at a particular site is large enough, however, there is the additional issue of which items to `display on the show window.' Even if, in principle, the number of items that can be displayed on an online marketing site is unlimited, the degree of exposure to the user is heavily influenced by whether they are displayed on the top page, etc. In this section, we consider how to optimize both the prices and the selection of items to display, in order to maximize the total revenue obtained at an online marketing site.
At the heart of the issue raised here, is the tradeoff known as the `ExplorationExploitation tradeoff' in the literature on online learning (reinforcement learning in particular [7]), formulated below for the current problem of our concern.
 If one wishes to maximize the immediate revenue, one should select those items estimated to have the maximum expected revenue.
 If one wishes to maximize the total revenue obtained in the long run, one should also take care to display a variety of items, so that their optimal prices, and hence their expected maximum revenues, can be reliably estimated.
The issue of the ExplorationExploitation tradeoff has been addressed by a number of authors in the literature [3,6,2,8,1], but the work in [8,1], in particular, does so in the context of online learning of (probabilistic) linear functions, and is closely related to the problem studied here.
In trying to resolve this tradeoff, we consider the following three measures.
 The immediate payoff of each item
In particular, we use the profit obtained during the last sales period (2T) for each item, i.e.  Variety of attribute vectors
We use the sum of Hamming distance among the vectors in the set of items selected to be displayed, i.e.  Uncertainty of estimating the optimal price function
We use the difference between the revenues obtained in the first half and the second half of the last sales period:
By combining (1) and (2), and (1) and (3) of these three measures, we propose two methods for selecting the items to display.
 Variety Selection
From among the set of candidate items, select a set of items such that the weighted summ of the variety measure and the immediate payoff measure is maximized, that is, select S such that Here, and are parameters controlling the relative contributions of the two measures.  Uncertainty Selection
From among the set of candidate items, select the top N items, maximizing the sum of the immediate payoff and the uncertainty measure: Thus, this method coincides with that of maximizing the larger of the revenues raised for the two different prices.
Since Variety Selection is a strategy that is more oriented towards exploration, we generally switch from Variety Selection to the method of selecting those expected to bring most immediate revenues, after an initial `exploration' phase. For Variety Selection, strict maximization of would be computationally infeasible. Thus, we settle with the following heuristic: We start with the set consisting with the top N items in terms of the immediate payoff, and then repeatly perform random swaps, if such a swap results in increasing the value of the above objective function.
We refer to the resulting online marketing strategy, consisting of good pricing and selection, by such names as StochPrice (Uncertainty) and FeaturePrice (Variety).
4. Performance Evaluation
4.1 Experimental Procedure
Consumption model
We model the consumption behavior of users by the following stochastic process.
At each time t and price p, do the following.
 Determine the number of visits in a unit time interval to the marketing site in question, according to a normal distribution with a fixed average and variance.
 Determine the probability of purchase by the following function of time t since the beginning of sales and price p where the exact forms of and will each take one of three possible forms (to be described shortly). The idea is that the purchase probability is governed by a time dependent factor and a price dependent factor, which are relatively independent of one another.
 Determine the number of sales in a given time period T by
We considered the following three forms for .
 is constant.
 decreases exponentially in time.
 is normally distributed.
For , we assume one of the following three forms.
 is constant.
 decreases exponentially in price.
 is a logistic function (smooth step function).
For most of the experimental results we report on, we used the first choice for and the last choice for .
Generation of attributes
The attributes associated with goods are generated by either of the following two methods: the independent method, and the random walk method. In the independent method, we generate each attribute vector by randomly generating a bit (0 or 1) by a fair coin. In the random walk method, we first generate a random attribute vector as above. We then generate the next vector by probabilistically flipping each bit independently, with a small fixed probability. This process is repeated until the desired number of vectors are obtained.
We then use these attributes to determine the consumption model for each good, so that goods having similar attribute vectors tend to have similar consumption patterns. More concretely, we determine each of the constants (such as C, K, c) determining and as the product of the attribute vector and a a real valued vector, which is randomly generated for each run.
Site visits
The number of site visits is determined independently for each good (by a normal distribution), when there is no issue of selecting goods to display. When a subset of goods is to be selected for display, the number of site visits for the selected items, and those for the items not selected, are determined by two different normal distributions. We ensure that the distribution of for the selected goods has a much higher mean than for the nonselected ones.
4.2 Experimental results
For all cases we consider, we basically used the total (cumulative) revenues obtained, for a fixed good or an ensemble of goods, as the performance measure for our pricing methods. In most cases. we compared the performance of the proposed pricing methods against that of the method of just keeping the initial price, and the ideal method of using a near optimal price from the beginning. Since the `optimal price' was not available in closed form for most of our experiments, in our plots, we substituted the optimal price by the estimated optimal price, as obtained by our pricing methods. The performance plots are averaged over five randomized runs, unless otherwise noted.
Results on StochPrice
Figure 7 plot the cumulative and instantaneous (per unit sales period) revenues obtained by the following three methods as a function of the number of trials: (i) keeping the initial price; (ii) using StochPrice; and (iii) using a nearoptimal price from the beginning. In this experiment, various parameters of StochPrice were set as follows. The constants within F and G, which determine the purchase probability, were set as , , and c=20. This means that the purchase probability is at most 0.5 and is around 0.1 for prices that are two units higher than the optimal price (which is around 16). The number of site visits per unit time, , was set to be distributed around 300 visits. Thus, at each unit sales period, a typical number of sales is in the tens.
Figure 7: (Top) Cumulative revenues and (Bottom) instantaneous revenues obtained by (i) keeping the initial price; (ii) using StochPrice; and (iii) using a nearoptimal price from the beginning.
Note that the third method is an ideal and impractical method, since the near optimal price is not available a priori. It is clearly seen in these graphs that after suffering a small initial loss due to the necessity to learn the optimal price, StochPrice quickly catches up with the ideal method in its performance.
Figure 8 shows how the price of a particular item changes over time (on a particular run) using StochPrice, for a variety of choices of the initial price. It is seen that the optimal price is approximately 16 dollars, and whatever the initial price is (within the range shown here), they converge to the optimal price.
Figure 8: Price changes made by StochPrice over 1,000 trials as a function of the initial price.
Figure 9 plots how the cumulative revenue obtained by StochPrice over 1,000 trials changes as a function of the initial price. In the graph, this is compared with how the revenue changes if one kept the initial price, also as a function of the initial price. A rather dramatic difference is observed: When the initial price is kept, the total revenue quickly falls off when the initial price is wrongly set, but using StochPrice, one can see that the total revenue is insensitive to the choice of the initial price for a good range of initial prices.
Figure 9: Cumulative revenues obtained by StochPrice and by keeping the initial price, as a function of the initial price.
Results on FeaturePrice
As before, we plot the cumulative and instantaneous (per unit sales period) revenues obtained by the following methods: (i) keeping the initial price, (ii) using FeaturePrice, and (iii) using a near optimal price from the beginning. (See Figure 10.) In this experiment, the number of goods was set to be 30, and the number of attributes was 10. Other parameters controling F and G were set as follows: , , and . was set to be averaged at 300 as before.
It is clearly seen that FeaturePrice also improves the revenue of a marketing site significantly. Whether the prices computed by FeaturePrice converge to the respective optimal price is harder to see, but we plot how the prices are changed by FeaturePrice: Figure 11 exhibits how the prices are changed by FeaturePrice, for 10 (chosen unintentionally) out of the 30 items used in the above experiment. One can see here that although the pricing is done via learning of a single target vector, a great variety of pricing is realized by utilizing the attributes attached to the items being sold. We plan in the near future to compare the performance of FeaturePrice with that of StochPrice in realistic settings, which we will report on in a full paper.
Figure 10: (Top) Cumulative revenues and (Bottom) instantaneous revenues obtained by (i) keeping the initial price; (ii) using FeaturePrice; and (iii) using a nearoptimal price from the beginning.
Figure 11: Price changes made by FeaturePrice for 10 out of the 30 items, over 1,000 trials.
Results on Good Selection Methods
We compared the revenues obtained by the two methods we propose to address the ExplorationExploitation tradeoff, namely Uncertainty Selection and Variety Selection, with the method of always selecting those goods that are estimated to bring about the maximum revenue (MaxProfitSelection). Here, the exploration period for Variety Selection was set at 200 out of the 1,000 trials in total. The parameters controling F and G and were set identically to the experiment for FeaturePrice.
Figure 12 shows the cumulative and instantaneous revenues obtained by each of the three methods. It is seen that the revenues obtained by both Uncertainty Selection and Variety Selection significantly outperform that of MaxProfit, with Uncertainty Selection being the favored for the experimental conditions we tried. Since Uncertainty Selection is computationally cheaper than Variety Selection, it appears to be the method of choice.
Figure 12:(Top) Cumulative revenues and (Bottom) instantaneous revenues obtained by VarietySelection, UncertaintySelection and MaxProfitSelection.
5. Concluding Remarks
We have proposed a new concept of online marketing with automatic pricing. In particular, we exhibited an implemented system and concrete pricing algorithms designed to maximize the profit of the sellers. The proposed scheme of automatic pricing is distinguished from other forms of dynamic pricing that are currently employed by online marketing sites on the Internet. Most webmarketing sites with dynamic pricing schemes base their pricing on the demandsupply information that is available via the internet. Auction [5,10] and reverse auction sites [9], as well as other pricing information services [4] share the property that the pricing is determined as a result of competition between buyers and sellers. Among those webmarketing sites that are currently operating on the Internet, perhaps `outletZoo' [11] employs a pricing scheme that is most closely related to ours. In particular, OutletZoo offers a dynamic pricing scheme they call `Automatic Drop.' It is a means to ensure that excess merchandises that manufacturers wish to sell are eventually all sold, and it works by simply making `prices fall at a seller determined percentage at regular intervals until everything is sold.' A variation of our automatic pricing scheme which is sensitive to constraints posed by the amount of goods in stock may prove to be a viable alternative. In the future, we wish to investigate extensions of the proposed pricing scheme to other scenarios and purposes, such as in a twoway interactive marketing environment like auction or in the presence of physical stock constraints. We believe that, through such extensions and variations, the concept of automatic pricing we propose in this paper can be developed into a model of online marketing that can truly benefit the buyers as well as the sellers.
References
 1
 N. Abe and P. Long. Associative reinforcement learning using linear probabilistic concepts, Proceedings of the 16th International Conference on Machine learning, pp.311, 1999.
 2
 N. Abe and J. Takeuchi, The `lobpass' problem and an online learning model of rational choice, Proceedings of the Sixth Annual ACM Conference on Computational Learning Theory, pp.422428, 1993. pp. 11981204, 1994.
 3
 D. A. Berry and B. Fristedt, Bandit Problems, Chapman and Hall, New York, 1985.
 4
 Dealtime, An Internet marketing service site (http://dealtime.com/).
 5
 Ebay, An Internet auction site (http://www.ebay.com/).
 6
 D. P. Helmbold and N. Littlestone and P. M. Long, Apple Tasting and Nearly OneSided Learning, Proceedings of the 33rd Annual Symposium on the Foundations of Computer Science, 1992.
 7
 L. Kaelbling, Associative Reinforcement Learning: Functions in kDNF, Machine Learning, 15(3), pp. 279298, 1994.
 8
 P. M. Long, Online evaluation and prediction using linear functions, Proceedings of the 10th Annual Conference on Computational Learning Theory, pp. 2131, 1997.
 9
 Nextag, An Internet reverse auction site (http://nextag.com/).
 10
 Princeline, An Internet auction site (http://www.priceline.com/).
 11
 OutletZoo, An Internet marketing site (http://outletzoo.com/).
 12
 M. T. Wasan, Stochastic Approximation, Cambridge University Press 1969.
Acknowledgements
We would like to thank Phil Long of National University of Singapore for valuable discussions on topics related to this paper. We also thank the anonymous referees for providing invaluable information. We wish to thank Dr. Doi, Dr. Koseki, Dr. Koike, Dr. Sakata, and Dr. Goto of NEC for their encouragement for this research. Finally, we thank Mr. Omoto and Dr. Takada of NIS for their programing efforts.
Vitae
Naoki Abe received the B.S. and M.S. degrees from Massachusetts Institute of Technology in 1984, and the Ph. D. degree from the University of Pennsylvania in 1989, all in Computer Science. After holding a post doctoral researcher position at the University of California, Santa Cruz, he joined NEC Corporation in 1990, where he is currently principal researcher in the C & C Media Research Laboratories. He is also visiting associate professor in the department of computational intelligence and systems science of Tokyo Institute of Technology. His research interests include theories and applications of machine learning to various domains, including internet information mining and navigation. 

Tomonari Kamba received his B.E. and M.E.,and Ph.D. in Electronics from the University of Tokyo in 1984, 1986 and 1997, respectively. He joined NEC Corporation in 1986, and he has been engaged in user interface design methodology, multimedia user interface, software agent, and Internet information service technology. He was a visiting scientist at the Graphics, Visualization and Usability Center at the College of Computing, Georgia Institute of Technology from 1994 to 1995. He is now research manager in the Human Media Research Laboratories. Dr. Kamba is a member of ACM SIGCHI and the Information Processing Society of Japan. 