In the case of time series data, we can augment the exogenous covariates in the model with lagged values of the response variable, i.e. the observed counts at previous time points. Because past observations enter the model directly, the model is “observation-driven.” Lags of the exogenous covariates can also be included. For instance, let the new covariate matrix be represented by Z, where
For more details see [link] .
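As a rough illustration of the augmentation described above, the following sketch builds a covariate matrix whose rows hold an intercept, one exogenous covariate, and the lagged counts. The function name `build_design_matrix`, the single covariate `price`, and the use of raw (untransformed) lagged counts are assumptions made for this example, not the exact definition of Z used in the text.

```python
def build_design_matrix(y, x, n_lags=1):
    """Return (Z, targets) for an observation-driven count model.

    Each row of Z is [1, x_t, y_{t-1}, ..., y_{t-n_lags}] (assumed layout).
    The first n_lags time points are dropped because their lags are unobserved.
    """
    rows, targets = [], []
    for t in range(n_lags, len(y)):
        lagged = [float(y[t - lag]) for lag in range(1, n_lags + 1)]
        rows.append([1.0, float(x[t])] + lagged)
        targets.append(y[t])
    return rows, targets


# Hypothetical data: weekly purchase counts and a price covariate.
counts = [3, 1, 4, 2, 5, 3]
price = [2.0, 2.5, 1.9, 2.2, 1.8, 2.1]
Z, targets = build_design_matrix(counts, price, n_lags=2)
for row, target in zip(Z, targets):
    print(row, "->", target)
```

Each row pairs the count at time t with the information available before t, which is what lets an ordinary Poisson GLM fitting routine be reused for the time series case.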
Armed with the GLM model for Poisson regression, we can begin clustering the TSC. In order to determine the similarity or dissimilarity between two TSC, a metric is needed to measure the “distance.” The classic Euclidean metric is not adequate for data with time dependence. We will use the empirical Kullback-Leibler (KL) likelihood metric [link] , which calculates the distance between two TSC by evaluating the relative fit of their respective models.
Let ${\lambda}_{j}$ be a given “model structure” for the data, i.e. an observation-driven Poisson model with specified covariates. The KL metric has the following expression.
where ${Y}_{k}$ is the set of data objects that belong to cluster k. Note that $\log p\left(y|{\lambda}_{k}\right)$ is the log-likelihood of the data y under the model ${\lambda}_{k}$. See [link] for discussion on the likelihood of observation-driven Poisson models. The measure is made symmetric by,
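The expressions themselves are not reproduced in this text, so the sketch below rests on an assumption: it uses one common form of the empirical KL distance, $d(k,j) = \frac{1}{|{Y}_{k}|}\sum_{y \in {Y}_{k}} \left[\log p(y|{\lambda}_{k}) - \log p(y|{\lambda}_{j})\right]$, and symmetrizes it by averaging $d(k,j)$ and $d(j,k)$. To keep the example short, a constant-rate Poisson model stands in for the observation-driven model; all function names are hypothetical.

```python
import math


def poisson_loglik(series, rate):
    """Log-likelihood of counts under a constant-rate Poisson model
    (a simplified stand-in for the observation-driven model)."""
    return sum(y * math.log(rate) - rate - math.lgamma(y + 1) for y in series)


def kl_distance(cluster_k, model_k, model_j, loglik):
    """Assumed empirical KL distance: average, over the series in cluster k,
    of how much better model k fits than model j."""
    total = sum(loglik(y, model_k) - loglik(y, model_j) for y in cluster_k)
    return total / len(cluster_k)


def symmetric_kl(cluster_k, model_k, cluster_j, model_j, loglik):
    """Symmetrized distance: mean of the two directed distances (assumed form)."""
    d_kj = kl_distance(cluster_k, model_k, model_j, loglik)
    d_jk = kl_distance(cluster_j, model_j, model_k, loglik)
    return 0.5 * (d_kj + d_jk)


# Hypothetical clusters, each holding one series, with fitted rates 2 and 8.
low = [[2, 3, 1]]
high = [[8, 9, 7]]
print(symmetric_kl(low, 2.0, high, 8.0, poisson_loglik))
```

Because the distance only needs a log-likelihood function, swapping in the observation-driven Poisson likelihood changes `loglik` and the model objects but not the distance code.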
With the KL metric, we apply a hierarchical bottom-up clustering algorithm. A flowchart of the algorithm is displayed in Figure 1.
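The flowchart's exact steps are not reproduced here, so the following is only a generic agglomerative sketch under stated assumptions: every series starts as its own cluster, the pair of clusters with the smallest symmetrized distance is merged, the merged cluster's model is refitted, and the process repeats until one cluster remains. A constant-rate Poisson model (fitted by the sample mean) again stands in for the observation-driven model, and all names are hypothetical.

```python
import math


def fit_rate(cluster):
    """MLE of a constant Poisson rate: the mean of all counts in the cluster."""
    counts = [y for series in cluster for y in series]
    return max(sum(counts) / len(counts), 1e-9)  # guard against log(0)


def loglik(series, rate):
    return sum(y * math.log(rate) - rate - math.lgamma(y + 1) for y in series)


def sym_distance(cluster_k, cluster_j):
    """Symmetrized empirical KL distance between two clusters (assumed form)."""
    rate_k, rate_j = fit_rate(cluster_k), fit_rate(cluster_j)
    d_kj = sum(loglik(y, rate_k) - loglik(y, rate_j) for y in cluster_k) / len(cluster_k)
    d_jk = sum(loglik(y, rate_j) - loglik(y, rate_k) for y in cluster_j) / len(cluster_j)
    return 0.5 * (d_kj + d_jk)


def agglomerate(series_list):
    """Merge the closest pair of clusters until one remains.

    Returns a list of (id_a, id_b, distance, new_id) tuples describing the tree.
    """
    clusters = {i: [i] for i in range(len(series_list))}  # id -> member indices
    merges, next_id = [], len(series_list)
    while len(clusters) > 1:
        ids = sorted(clusters)
        best = None
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                d = sym_distance([series_list[i] for i in clusters[a]],
                                 [series_list[i] for i in clusters[b]])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d, next_id))
        next_id += 1
    return merges


# Hypothetical data: two low-volume and two high-volume purchase series.
data = [[1, 2, 1, 0], [2, 1, 1, 1], [9, 10, 8, 11], [10, 9, 12, 9]]
for merge in agglomerate(data):
    print(merge)
```

The merge list plays the role of the cluster tree: reading it from first to last entry traces the tree from the leaves up to the root.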
The algorithm produces a cluster tree similar to the figure below. The bottom-up clustering method is easy to visualize, breaks the objects down into nested groups, and eliminates the need for a stopping criterion.
Though count data are prevalent in consumer behavior, obtaining commercial data for MBC is expensive. Thus, for this project, we use results from previous studies on marketing data to create a data set that realistically mimics consumer behavior.
Niraj et al. [link] proposed an economic model for consumer purchases of bacon and eggs. Based on store scanner data, the authors studied the consumer sensitivities to various variables such as personal utility, product prices, product displays, and purchase history. For the purpose of data simulation, key elements from this economic model were borrowed to create our own consumer bacon and eggs purchase data.
Let ${Y}_{b,t}$ and ${Y}_{e,t}$ be jointly distributed as a bivariate Poisson random variable, representing a consumer's purchases of bacon and eggs, respectively, during time window t. Then ${Y}_{b,t}$ and ${Y}_{e,t}$ can be modeled using a trivariate reduction [link] :
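As a sketch of how such data could be simulated, the standard trivariate reduction writes each count as its own independent Poisson term plus a Poisson term shared by both products, so the shared rate equals the covariance between the two counts. The rate values, the function names, and the use of constant rates (rather than rates driven by prices, displays, and purchase history) are assumptions for illustration only.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method
    (adequate for the small rates used here)."""
    threshold = math.exp(-rate)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_purchases(rate_b, rate_e, rate_shared, n_periods, seed=0):
    """Trivariate reduction: bacon = X_b + X_shared, eggs = X_e + X_shared,
    with X_b, X_e, X_shared independent Poisson variables."""
    rng = random.Random(seed)
    bacon, eggs = [], []
    for _ in range(n_periods):
        shared = sample_poisson(rate_shared, rng)
        bacon.append(sample_poisson(rate_b, rng) + shared)
        eggs.append(sample_poisson(rate_e, rng) + shared)
    return bacon, eggs


def covariance(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


# Hypothetical rates: the sample covariance should be close to the shared rate.
bacon, eggs = simulate_purchases(1.0, 2.0, 0.5, n_periods=20000, seed=1)
print(sum(bacon) / len(bacon), sum(eggs) / len(eggs), covariance(bacon, eggs))
```

Under this construction the marginal means are the sums of the own and shared rates, which gives a quick sanity check on any simulated data set.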