One of the things that I have been doing over the past several months has been to write up how Magento works on the inside. These write-ups have primarily been for my own benefit, but they could probably amount to a small book by now. I was doing this mostly to make sure that what I thought I understood about Magento was actually right, and so I have intentionally been doing a pretty deep dive on Magento’s internal workings.
It has been a tremendously useful exercise for me and so I wanted to share it. The first one was about configuration, which is really one of the important linchpins of the system. EVERYTHING depends on the configuration to some degree. On a typical setup a compiled configuration object in the cache is well north of 300k of XML.
This is a long write-up. That’s because it’s pretty complete. Yes, I’m sure I missed some things but this will probably teach you more about Magento configuration than you ever wanted to know. Also, it is a little on the raw side, so I beg your forgiveness on that.
Here we go.
Order of operations
- Load local.xml
- Load module XMLs from etc/modules
- Load individual config.xml files for modules
- Reload local.xml to make sure its values have not been overridden
- Load configuration from the DB
- Merge default scope values into config
- Copy default values into websites nodes
- Copy website values into store nodes
- Cache configuration and config sections in cache
Description of operations
Configuration starts in Mage::run() when Mage::_setConfigModel() is called. This method allows a developer to use a class other than the default Mage_Core_Model_Config. However, the class that the developer supplies must extend Mage_Core_Model_Config and must be named in the config_model option. Some reasons for using this could include getting configuration from alternate data sources such as a web service or perhaps an LDAP server, but generally the default would be sufficient.
When Mage_Core_Model_Config is created it sets its own cache ID as “config_global”. It also creates an object called Mage_Core_Model_Config_Options which is used for creating the base options for the application such as the location of the etc, code and design directories. Additionally it creates an instance of Mage_Core_Model_Config_Base which is used to clone configuration nodes to help with merging XML configuration files.
The parent of the Mage_Core_Model_Config class is Mage_Core_Model_Config_Base and its parent constructor is called. In the constructor the class Mage_Core_Model_Config_Element is set as the elementClass. Because it is set here it does not seem like it can be overridden using the base configuration object. In other words, the element class is not configurable, and you’ll probably never want it to be. If you want a different element class for some unknown reason you will need to create your own configuration class and change it there. You can then pass the name of that config model under the 'config_model' key of the $options array in Mage::run($code, $type, $options). But, again, very seldom would you really want to do that.
The parent of Mage_Core_Model_Config_Base is Varien_Simplexml_Config. This is the most basic class for handling configuration and since Mage::_setConfigModel() requires that a configuration object extend Mage_Core_Model_Config_Base, a child class of Varien_Simplexml_Config, this class will be loaded for any configuration activity that needs to be completed.
The constructor of Varien_Simplexml_Config accepts a sourceData parameter. This parameter can be null, a filename or an XML string. It can also be an instance of Varien_Simplexml_Element, which is passed into setXml() and will overwrite any previously calculated configuration options.
If the option passed is a file name then loadFile() will be called. The file data is loaded via file_get_contents() after which processFileData() is called. This method does not do anything in the base configuration object but could theoretically be used to massage data, perhaps like having a JSON configuration file which would be parsed and returned as an XML string. Or perhaps there could be a call to a third party web service which could provide additional configuration options to be merged based on the configuration text. However, if you are passing it to Mage_Core_Model_Config::__construct() you will run into a problem with anything but an array. This is because Mage_Core_Model_Config creates a Mage_Core_Model_Config_Options class internally for some basic system options, like file locations. This class does not understand anything but arrays and so you end up with a misconfigured system if you pass in a string. So, stick to arrays.
Once the file has been loaded, or if the sourceData parameter was XML, the loadString() method is called. This method takes the raw XML string as a parameter and passes it to simplexml_load_string() along with _elementClass. This will cause the SimpleXML extension to parse the XML document into elements of the type specified. After it has been parsed, Varien_Simplexml_Config checks to make sure that the element is an instance of Varien_Simplexml_Element. This validates both that the XML is actually parsable and that any element class that is specified extends the *_Element class.
At the end of this process the configuration object is stored in the static variable Mage::$_config and is directly accessible via the Mage::getConfig() method.
After Mage::_setConfigModel() is run the previously created instance of Mage_Core_Model_App is referenced and its run() method is called. In a default scenario this will be the first time that the configuration is actually loaded. The run() method calls another method called baseInit(). The run() method takes an array of options that can be passed into baseInit() but in the default scenario this is not done.
Mage_Core_Model_App::baseInit() retrieves the configuration object from Mage::getConfig() and stores it locally in the object. If there are options provided in the bootstrap this is where they are added to the configuration object. The baseInit() method calls the loadBase() method which retrieves the etc directory location from the configuration options which were set earlier and iterates over any .xml files in the directory and merges them into $this. During the iteration the _prototype instance is cloned and the clone is used to load the XML file. The resulting object is then merged into the current configuration node.
That merging is accomplished by calling the extend() method. This method retrieves the local instance of the Varien_Simplexml_Element, or the specified child node. This node’s extend() method is called which takes the to-be-merged element’s child nodes and adds each to itself. This is done by iterating over the nodes and explicitly adding each node and its attributes. This extend() method is a feature that is not part of the main PHP XML handling functionality. If you try to replicate this functionality on your own it will take a lot of time and code. Just steal Varien_Simplexml_*. 🙂
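To make the merge behavior a little more concrete, here is a rough Python model of a recursive "extend" between two XML trees. It is only a sketch of the idea described above, not Varien_Simplexml_Element's actual PHP code; the function name, the overwrite flag and the sample XML are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def extend(target, source, overwrite=True):
    """Recursively merge the children of `source` into `target` (illustrative)."""
    for child in source:
        existing = target.find(child.tag)
        if len(child) == 0:
            # Leaf node: replace the existing value only when overwriting is allowed.
            if existing is not None:
                if not overwrite:
                    continue
                target.remove(existing)
            leaf = ET.SubElement(target, child.tag, dict(child.attrib))
            leaf.text = child.text
        else:
            # Branch node: create it if it is missing, then merge its children.
            if existing is None:
                existing = ET.SubElement(target, child.tag, dict(child.attrib))
            extend(existing, child, overwrite)
    return target


# Hypothetical sample documents, loosely shaped like Magento config files.
base = ET.fromstring(
    "<config><global><cache><backend>file</backend></cache></global></config>"
)
local = ET.fromstring(
    "<config><global><cache><backend>apc</backend>"
    "<prefix>m_</prefix></cache></global></config>"
)
merged = extend(base, local)
```

In this model a later file wins over an earlier one for leaf values, while branches are merged rather than replaced, which is the property that lets many small files build up one large document.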
Once all of the files have been loaded the Mage_Core_Model_Config object checks to see if local.xml has been loaded and marks a flag.
After Mage_Core_Model_Config::_initBaseConfig() has finished the cache is loaded in _initCache(). Cache options can be provided with an option key of ‘cache’. Any options that are provided will overwrite configuration values. The cache node is ‘global/cache’ and is hard coded into the application.
Mage_Core_Model_Config::getNode() takes three parameters, all of which are optional, to allow the developer to retrieve the most pertinent configuration node. The parameters are path, scope and scope code.
The scope and scope code allow developers to retrieve like content from different “contexts” such as stores or websites. Scope for stores and websites can be provided as plural or singular values. If a scope is provided that is not default, store or website an exception will be thrown.
If a valid path is provided an object of type Mage_Core_Model_Config_Element is returned. Because this object ultimately extends SimpleXMLElement it can be acted upon like any SimpleXML element.
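As a sketch of how the three getNode() parameters could combine into a single lookup path, here is a small Python model. It follows the description above (singular or plural scope names, an error for unknown scopes); the function name and the exact rules are assumptions, not a copy of Magento's PHP.

```python
VALID_SCOPES = {"default", "stores", "websites"}


def scoped_path(path=None, scope="", scope_code=None):
    """Build the full configuration path for a (path, scope, scope code) triple."""
    if scope == "":
        # No scope given: the path is used as-is.
        return path or ""
    # Singular scope names are accepted and normalized to the plural node name.
    if scope in ("store", "website"):
        scope += "s"
    if scope not in VALID_SCOPES:
        raise ValueError('Unknown scope "%s"' % scope)
    parts = [scope]
    if scope_code:
        parts.append(str(scope_code))
    if path:
        parts.append(path)
    return "/".join(parts)
```

With this model, asking for "test/data/value" in scope "store" with a hypothetical store code "german" resolves to "stores/german/test/data/value".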
Up until this point none of the module configurations have been loaded. The module config may be cached whereas the app/etc/*.xml configuration is not. This caching is determined by the settings for _allowedCacheOptions in Mage_Core_Model_Cache. This is set by querying the core_cache_option table which states which areas can be cached (do SELECT * FROM core_cache_option and compare it to the Cache Configuration screen in the UI). This is where the cache enablement for things like EAV, block_html or config are stored.
The module configuration files will be loaded in the order of Mage_All, Mage_* and then any external configuration files. There is the ability to run a limited version of Magento with only specific modules allowed, but that would seem to be implemented by creating a custom bootstrap process.
The module XML files are merged into a temporary variable which is processed locally. It will be iterated over during which the dependencies of each module will be calculated in the method Mage_Core_Model_Config::_sortModuleDepends(). The module list is resorted and if any module dependencies for active modules cannot be satisfied an exception is thrown. The module configurations are sorted in reverse order, presumably because of the order that the files are loaded. The module configurations are then iterated over in normal order so that the base dependency of Mage_Core, which has no dependencies, is done first and the module dependencies are processed in order of greater to less importance. If a module with a dependency is defined but cannot be satisfied an exception is thrown at this point too.
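The dependency handling is essentially a topological sort. Here is a minimal Python sketch of that idea, assuming each module declares whether it is active and which modules it depends on. The data layout, the error messages and the Acme_Checkout module are hypothetical, and this is not the algorithm in _sortModuleDepends() itself.

```python
def sort_modules(modules):
    """Order modules so that each one comes after everything it depends on.

    `modules` maps a module name to {"active": bool, "depends": [names]}.
    """
    ordered = []
    state = {}  # name -> "visiting" or "done"

    def visit(name, required_by=None):
        module = modules.get(name)
        if module is None or not module["active"]:
            # A dependency that is missing or inactive cannot be satisfied.
            raise RuntimeError(
                'Module "%s" requires module "%s"' % (required_by, name)
            )
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise RuntimeError('Circular dependency involving "%s"' % name)
        state[name] = "visiting"
        for dependency in module["depends"]:
            visit(dependency, required_by=name)
        state[name] = "done"
        ordered.append(name)

    for name, module in modules.items():
        if module["active"]:
            visit(name)
    return ordered


modules = {
    "Acme_Checkout": {"active": True, "depends": ["Mage_Sales"]},
    "Mage_Sales": {"active": True, "depends": ["Mage_Core"]},
    "Mage_Core": {"active": True, "depends": []},
}
load_order = sort_modules(modules)
```

Mage_Core, having no dependencies, ends up first in load_order, and the hypothetical Acme_Checkout module ends up after everything it needs.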
Once the dependencies have been calculated a new configuration model is instantiated which will be merged into the mainline config. The sorted module list is iterated over, and each module’s declaration is retrieved from the local configuration object and merged into the sorted configuration. Once the module list has been iterated over, the sorted configuration object is merged into the mainline configuration object.
At this point the individual configuration files for each of the modules will be read. Prior to reading them Magento will check to see if modules in the local code pool can be configured. If they are disabled then the directory for the local code pool will be removed from the include_path.
All the active modules are then iterated over and a quick check is done to make sure that if local modules are not allowed they are not included. It looks like this check is only done on the configuration, though, so technically something could be installed in the local code pool but state that it is in base or core and bypass this.
It will load the config.xml from the etc directory of each module and merge it into the main configuration.
Once the configuration object has done that it will re-load the app/etc/local.xml file to make sure that its directives have not been overridden.
The final part of the module configuration loading task is to find any configuration elements that extend any others. If there are any (and I did not find any in the default configuration) the configuration object will copy the elements into the element being extended.
Once the configuration files have all been loaded and processed any system updates are executed (which I will not cover here). From there configuration options are loaded from the database via the loadToXml() method in Mage_Core_Model_Resource_Config, referenced from Mage_Core_Model_Config::loadDb().
When retrieving the database adapter for reading the configuration, it first checks the write adapter to see if it is in the middle of a transaction. If it is in a transaction it returns the write adapter and if it is not it returns the read adapter. This feature is part of Mage_Core_Model_Resource_Db_Abstract and as such is available to any resource that extends it. Kinda neat.
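The adapter choice boils down to one conditional. A tiny Python model of it, with a made-up Adapter class and a transaction_level counter standing in for whatever the real adapter tracks:

```python
class Adapter:
    """Hypothetical stand-in for a database connection."""

    def __init__(self, name):
        self.name = name
        self.transaction_level = 0  # greater than 0 while a transaction is open


def pick_read_adapter(read_adapter, write_adapter):
    """Return the write adapter during a transaction, else the read adapter."""
    if write_adapter.transaction_level > 0:
        return write_adapter
    return read_adapter


read = Adapter("read")
write = Adapter("write")
outside = pick_read_adapter(read, write).name

write.transaction_level = 1  # simulate an open transaction
inside = pick_read_adapter(read, write).name
```

The point of reading from the write connection is that uncommitted changes are only visible on the connection that made them.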
The first thing that loadToXml() does is load the website data into the configuration object. It does this by querying the core_website table for the website_id, code and name columns. This data is inserted into the configuration at /websites.
After the websites have been loaded the store configuration is inserted. The core_store table is queried for the store_id, code, name and website_id for the store and stored in the /stores location.
After the stores have been loaded the core configuration is loaded. This is done by querying the core_config_data table. From the result set, the configuration items that have a scope of “default” are entered into the configuration first. Then each of the website configurations is iterated over and the values for the default scope are copied into each.
After the values for the default scope have been copied in then the nodes for the individual scope are copied into their respective paths. Yes, the configuration is copied to the various contexts by default. If you have a lot of these contexts you could end up with a very large configuration document.
After the configuration values have been copied into the website configuration, each of the stores for each of the websites has the values for its “owner” website copied into it. Then the configuration items for each individual store are copied into the store configuration node.
One interesting thing to note about this process is that if a website or store scope is in the configuration but not in the list of stores or websites, its configuration options are automatically deleted from the configuration.
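The copy-down from default to website to store can be modeled with plain dictionaries. This Python sketch assumes rows shaped like (scope, scope code, path, value) and uses codes where the real table uses IDs; all names and sample values are hypothetical.

```python
def build_scoped_config(rows, websites, stores):
    """Copy default values into websites, then website values into stores.

    rows:     iterable of (scope, scope_code, path, value) tuples
    websites: list of known website codes
    stores:   dict mapping store code -> owning website code
    """
    rows = list(rows)

    def values_for(scope, code=None):
        return {p: v for s, c, p, v in rows if s == scope and c == code}

    default = values_for("default")

    website_cfg = {}
    for code in websites:
        # Start from a copy of default, then apply website-specific values.
        website_cfg[code] = {**default, **values_for("websites", code)}

    store_cfg = {}
    for code, website in stores.items():
        # Start from the owning website, then apply store-specific values.
        store_cfg[code] = {**website_cfg[website], **values_for("stores", code)}

    # Rows for unknown websites or stores are never copied, so they drop out.
    return {"default": default, "websites": website_cfg, "stores": store_cfg}


rows = [
    ("default", None, "test/data/value", "base"),
    ("default", None, "test/data/other", "x"),
    ("websites", "main", "test/data/value", "site"),
    ("stores", "german", "test/data/value", "de"),
    ("stores", "ghost", "test/data/value", "orphan"),
]
config = build_scoped_config(
    rows, ["main"], {"english": "main", "german": "main"}
)
```

Every store ends up with a full copy of every value, which is why the document grows quickly as stores are added.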
At this point the configuration is stored in the cache, as long as the configuration is allowed to be cached. The saving mechanism iterates over each of the sections to check the level of recursion required for each of the specified sections. This allows the system to cache individual sub-sections instead of having to cache the whole section. By default this is limited to stores so that each store can be referenced individually. Given the size of the configuration document, separating individual store config elements is a good thing. Once the specified sections have been added to the parts array the rest of the configuration is stored in the generic cache tag.
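Here is a rough Python model of splitting a configuration into cacheable parts by section and recursion level. The nested-dict layout, the part naming scheme and the section levels in the example are assumptions for illustration; only the “config_global” ID comes from the description above.

```python
def split_for_cache(config, sections, cache_id="config_global"):
    """Split `config` into separately cached parts.

    config:   nested dict standing in for the XML document
    sections: dict mapping section name -> recursion level
    """
    parts = {}
    remainder = dict(config)

    def save_section(part_id, node, level):
        if level > 0 and isinstance(node, dict):
            # Recurse so that each child gets its own cache entry.
            for name, child in node.items():
                save_section(part_id + "_" + name, child, level - 1)
        else:
            parts[part_id] = node

    for name, level in sections.items():
        if name in remainder:
            save_section(cache_id + "_" + name, remainder.pop(name), level)

    # Whatever was not split out is stored under the generic ID.
    parts[cache_id] = remainder
    return parts


config = {
    "global": {"cache": {"backend": "file"}},
    "websites": {"main": {"test": "site"}},
    "stores": {"english": {"test": "en"}, "german": {"test": "de"}},
}
parts = split_for_cache(config, {"stores": 1, "websites": 0})
```

In this example each store becomes its own part, the websites section is stored whole, and everything else lands in the generic entry.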
At this point the configuration has been loaded and cached. There is more to be done before the request can be dispatched but that will be covered in another document.
Retrieving configuration is done by requesting a path that corresponds to the XML structure, similar to XPath but without many of the options that XPath provides (thankfully). This is done by calling the getNode() method on the configuration object that the developer is working with. Calling this method will return the Varien_Simplexml_Element object that corresponds with the requested node, or false if the node does not exist (I would argue null would be more appropriate, but c’est la vie).
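The path lookup itself is just a walk down the tree, one segment at a time. A minimal Python sketch of that walk, returning False for a missing node to mirror the behavior described above (the function name and sample XML are made up):

```python
import xml.etree.ElementTree as ET


def get_node(root, path=None):
    """Walk `path` (segments separated by '/') down from `root`."""
    if not path:
        return root
    node = root
    for name in path.split("/"):
        node = node.find(name)
        if node is None:
            # Mirror the described behavior: a missing node yields False.
            return False
    return node


root = ET.fromstring(
    "<config><default><test><data><value>42</value></data></test></default></config>"
)
found = get_node(root, "default/test/data/value")
missing = get_node(root, "default/test/nope")
```

Because a miss is False rather than an element, calling code has to check the result before reading a value from it.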
To retrieve the appropriate configuration node the developer has several different options available. I would argue that it is best to get configuration from the most specific component. Since the specificity goes from default -> website -> store, it would be preferred to get a configuration value of “test/data/value” from Mage::app()->getStore()->getConfig('test/data/value') if one is working with storefront-based functionality. If the logic is working on the website level then it would be Mage::app()->getWebsite()->getConfig('test/data/value'). In other words, when working within the store or website contexts, the context of the configuration option requested should match, rather than calling Mage::getConfig()->getNode("stores/{$storeId}/test/data/value").
Comments
Gerry
Thanks Kevin for in depth article on Mage store config saving. I have managed to save a multi-field (you can add values dynamically) but only in serialized format, any idea how to store these values non-serialized? Ie each field has its own record in core_config_data table?
Kevin Schroeder
If you are storing them in core_config_data you can’t store them individually. The whole purpose for the core_config_data table is to populate the Magento configuration object. It is not meant to be queried. For that reason you assign a backend type that allows you to store structured data in a serialized format. If you are looking for data that is stored in core_config_data you need to retrieve it from the configuration object, not the DB.