We describe our experience developing a flexible architecture for collecting interaction data for a medical cognitive tutoring system. The architecture encompasses (1) an agent-based communications system that passes messages between all tutor components and captures and stores them in relational format, (2) a schema for managing all system data, from low-level interface events to student-tutor interactions and experimental variables, and (3) an interface for querying and retrieving these data. The system has been in use for the past year and now holds data from one large study and several smaller studies. We discuss some of the lessons we have learned during that time as we strive to achieve a scalable and maintainable system to support educational data mining in our domain. We also argue that a standards-based approach to messaging could facilitate the development of shared data sets, and especially shared analytic services, for the next generation of tutoring systems.