# Introduction

As its name suggests, Configuration stores every kind of configuration on the Hadoop platform, whether it comes from a file (such as core-default.xml) or from user code (for example, conf.setInt("dfs.replication", 1)). It also warns you when you use a deprecated key. So how does it work? I will try to explain it at the source-code level.

# Serialization and Deserialization

Configuration can be serialized to the file system and deserialized back into an instance. It does this by implementing the Writable interface, which declares only two methods: write(DataOutput out) and readFields(DataInput in).
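To make the contract concrete, here is a minimal sketch that uses only the JDK. SketchWritable is a local stand-in that declares the same two methods as org.apache.hadoop.io.Writable, and KeyValue and WritableSketch are hypothetical names invented for this example; none of them are Hadoop classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Local stand-in with the same two methods as org.apache.hadoop.io.Writable.
interface SketchWritable {
    void write(DataOutput out) throws IOException;     // serialize this object's fields
    void readFields(DataInput in) throws IOException;  // overwrite this object's fields
}

// Hypothetical example type: a single key/value pair.
class KeyValue implements SketchWritable {
    String key = "";
    String value = "";

    KeyValue() {}

    KeyValue(String key, String value) {
        this.key = key;
        this.value = value;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(key);
        out.writeUTF(value);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Fields must be read in the same order they were written.
        key = in.readUTF();
        value = in.readUTF();
    }
}

class WritableSketch {
    // Serialize to an in-memory buffer, then fill a fresh instance from it.
    static KeyValue roundTrip(KeyValue original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            KeyValue copy = new KeyValue();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        KeyValue copy = roundTrip(new KeyValue("dfs.replication", "1"));
        System.out.println(copy.key + "=" + copy.value); // prints dfs.replication=1
    }
}
```

Note that readFields fills in an existing object rather than returning a new one, which is why the caller creates an empty instance first.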

We call write when we need to serialize a Configuration instance to a file, and readFields when we need to deserialize it from that file. A few lines of code are enough to serialize and deserialize a Configuration instance.

To run such a program, you have to create a Maven project and add hadoop-common as a dependency.

When it runs, a file named confi.ser is created that stores the Configuration instance. We then load (that is, deserialize) it from the file and get the instance back. We can look through the source code of write and readFields in Configuration to learn more.
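The file round trip can be sketched with the JDK alone. FileBackedConf below is a hypothetical stand-in for Configuration that holds only string properties, and FileRoundTrip is an invented helper. With hadoop-common on the classpath, Configuration's own write(DataOutput) and readFields(DataInput) would take the place of the toy methods. The sketch writes to a temporary file rather than a fixed path.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical stand-in for Configuration: string properties only.
class FileBackedConf {
    private final Properties props = new Properties();

    void set(String key, String value) {
        props.setProperty(key, value);
    }

    String get(String key) {
        return props.getProperty(key);
    }

    // Write the entry count, then each key/value pair.
    void write(DataOutputStream out) throws IOException {
        out.writeInt(props.size());
        for (String key : props.stringPropertyNames()) {
            out.writeUTF(key);
            out.writeUTF(props.getProperty(key));
        }
    }

    // Replace the current contents with what the stream holds.
    void readFields(DataInputStream in) throws IOException {
        props.clear();
        int size = in.readInt();
        for (int i = 0; i < size; i++) {
            String key = in.readUTF();
            String value = in.readUTF();
            props.setProperty(key, value);
        }
    }
}

class FileRoundTrip {
    // Serialize the instance into the given file.
    static void save(FileBackedConf conf, Path file) {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            conf.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deserialize a new instance from the given file.
    static FileBackedConf load(Path file) {
        FileBackedConf conf = new FileBackedConf();
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            conf.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return conf;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("confi", ".ser");
        FileBackedConf conf = new FileBackedConf();
        conf.set("dfs.replication", "1");
        save(conf, file);
        System.out.println(load(file).get("dfs.replication")); // prints 1
        Files.delete(file);
    }
}
```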

The fields that serialization and deserialization apply to are this.properties and this.updatingResource. The former stores all the configuration entries and is the most important field in Configuration. The latter maps each key to the resource that most recently loaded or modified it. For example, if Configuration loads a file configuration.xml that sets dfs.replication, a new entry is added to this.updatingResource that maps dfs.replication to configuration.xml.
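The idea of writing each key together with its value and its source can be sketched as follows. TrackedConf and TrackedConfDemo are hypothetical names, and the three-argument set is an invention of this example. The sketch uses plain writeInt and writeUTF calls, so its byte layout is an assumption for illustration and is not meant to match the format Hadoop writes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in that keeps the two fields described above.
class TrackedConf {
    final Properties properties = new Properties();
    final Map<String, String[]> updatingResource = new HashMap<>();

    // Record the value and remember which resource supplied it.
    void set(String key, String value, String source) {
        properties.setProperty(key, value);
        updatingResource.put(key, new String[] { source });
    }

    void write(DataOutput out) throws IOException {
        out.writeInt(properties.size());
        for (String key : properties.stringPropertyNames()) {
            out.writeUTF(key);
            out.writeUTF(properties.getProperty(key));
            // Each entry carries the names of the resources that set it.
            String[] sources = updatingResource.getOrDefault(key, new String[0]);
            out.writeInt(sources.length);
            for (String source : sources) {
                out.writeUTF(source);
            }
        }
    }

    void readFields(DataInput in) throws IOException {
        properties.clear();
        updatingResource.clear();
        int size = in.readInt();
        for (int i = 0; i < size; i++) {
            String key = in.readUTF();
            String value = in.readUTF();
            String[] sources = new String[in.readInt()];
            for (int j = 0; j < sources.length; j++) {
                sources[j] = in.readUTF();
            }
            properties.setProperty(key, value);
            updatingResource.put(key, sources);
        }
    }
}

class TrackedConfDemo {
    // Serialize to memory and read the bytes back into a new instance.
    static TrackedConf roundTrip(TrackedConf conf) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            conf.write(new DataOutputStream(bytes));
            TrackedConf copy = new TrackedConf();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TrackedConf conf = new TrackedConf();
        conf.set("dfs.replication", "1", "configuration.xml");
        TrackedConf copy = roundTrip(conf);
        System.out.println(copy.updatingResource.get("dfs.replication")[0]); // prints configuration.xml
    }
}
```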

# Detect Deprecated Keys

When Configuration is loaded, a default list of deprecated keys is loaded into defaultDeprecations as well.

When you set a configuration via something like configuration.set("name", "value"), it first checks whether the given key is deprecated. If it is, it stores the value under both the deprecated key and the new key, and warns the user once that the deprecated key should no longer be used.
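The behavior described above can be sketched in a few lines. DeprecationAwareConf and its addDeprecation method are hypothetical and only mirror the idea; they are not Hadoop's implementation. The example registers fs.default.name as the deprecated form of fs.defaultFS by hand, and collects warnings in a list instead of logging them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of deprecation-aware set(); not Hadoop's own code.
class DeprecationAwareConf {
    private final Map<String, String> deprecations = new HashMap<>(); // old key -> new key
    private final Map<String, String> props = new HashMap<>();
    private final Set<String> warned = new HashSet<>();               // keys already warned about
    final List<String> warnings = new ArrayList<>();                  // stands in for a logger

    void addDeprecation(String oldKey, String newKey) {
        deprecations.put(oldKey, newKey);
    }

    void set(String key, String value) {
        props.put(key, value);
        String newKey = deprecations.get(key);
        if (newKey != null) {
            // Keep the value reachable under both names.
            props.put(newKey, value);
            // Set.add returns false on repeats, so each key warns only once.
            if (warned.add(key)) {
                warnings.add(key + " is deprecated. Instead, use " + newKey);
            }
        }
    }

    String get(String key) {
        return props.get(key);
    }
}

class DeprecationDemo {
    public static void main(String[] args) {
        DeprecationAwareConf conf = new DeprecationAwareConf();
        conf.addDeprecation("fs.default.name", "fs.defaultFS");
        conf.set("fs.default.name", "hdfs://a:9000");
        conf.set("fs.default.name", "hdfs://b:9000");
        System.out.println(conf.get("fs.defaultFS"));   // prints hdfs://b:9000
        System.out.println(conf.warnings.size());       // prints 1
    }
}
```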

When Configuration is loaded, it looks for two files on the classpath, core-default.xml and core-site.xml, and loads each one that it finds.
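A rough sketch of that lookup, using only the JDK, is shown below. DefaultResourceLoader is a hypothetical class, and the parsing assumes the usual layout of property elements with name and value children. Because core-site.xml is read after core-default.xml, a value it defines replaces the default one in this sketch.

```java
import java.io.InputStream;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch of loading default resources from the classpath.
class DefaultResourceLoader {
    // Loaded in this order, so later files override earlier ones.
    static final String[] DEFAULT_RESOURCES = { "core-default.xml", "core-site.xml" };

    static Properties load(ClassLoader loader) {
        Properties props = new Properties();
        for (String name : DEFAULT_RESOURCES) {
            try (InputStream in = loader.getResourceAsStream(name)) {
                if (in == null) {
                    continue; // resource is not on the classpath: skip it
                }
                Document doc = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder()
                        .parse(in);
                NodeList nodes = doc.getElementsByTagName("property");
                for (int i = 0; i < nodes.getLength(); i++) {
                    Element property = (Element) nodes.item(i);
                    String key = property.getElementsByTagName("name")
                            .item(0).getTextContent().trim();
                    String value = property.getElementsByTagName("value")
                            .item(0).getTextContent().trim();
                    props.setProperty(key, value);
                }
            } catch (Exception e) {
                throw new IllegalStateException("Cannot load " + name, e);
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Prints how many properties were found on the current classpath.
        Properties props = load(DefaultResourceLoader.class.getClassLoader());
        System.out.println(props.size());
    }
}
```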