This paper compares two solutions for human-like perception built on two different modular “plug-and-play” frameworks, CAVIAR (List et al., 2005) and Psyclone (Thórisson et al., 2004, 2005a). Each uses a central point of configuration and requires the modules to be auto-descriptive, auto-critical and auto-regulative (Crowley and Reignier, 2003) for fully autonomous configuration of processing and dataflow. This allows new modules to be added to or removed from the system with minimal reconfiguration. CAVIAR uses a centralised global controller (Bins et al., 2005), whereas Psyclone supports a fully distributed control architecture. We implemented a computer vision-based human behaviour tracker for public scenes in both frameworks. CAVIAR’s global controller uses offline-learned knowledge to regulate module parameters and select between competing results, whereas in Psyclone dynamic multi-level control modules adjust parameters, data and process flow. The two frameworks thus yield very different solutions to control issues such as dataflow regulation and module substitution. However, we found that both frameworks allow easy incremental development of modular architectures with increasingly complex functionality. Their main differences lie in runtime efficiency and module interface semantics.