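Today we will learn how to convert XML to JSON and XML to Dict in Python. We can use Python's xmltodict module to read an XML file and convert it to a Dict or to JSON data. We can also stream large XML files and convert them to dictionaries. Before getting to the code, let's first understand why XML conversion is needed.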
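Converting XML to Dict/JSON
XML files have slowly become dated, but a considerable number of systems on the web still use this format. XML is heavier than JSON, so most developers prefer the latter in their applications.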
Getting started with xmltodict
Before we can use the xmltodict module, we have to install it. We will mainly use pip for the installation.
Installing the xmltodict module
Here is how to install the xmltodict module with pip, which downloads it from the Python Package Index (PyPI):
pip install xmltodict
The installation finishes quickly because xmltodict is a very lightweight module. It does not depend on any other external module, which keeps it small and avoids version conflicts. On Debian-based systems, the module can also be installed with the apt tool (the package is named python3-xmltodict on current releases; older, Python 2 based releases called it python-xmltodict):
sudo apt install python3-xmltodict
Another plus is that this module has an official Debian package.
Python XML to JSON
The best place to start with this module is the operation it is mainly used for: converting XML to JSON.
import xmltodict
import pprint
import json

my_xml = """
    <audience>
        <id what="attribute">123</id>
        <name>Shubham</name>
    </audience>
"""

# Parse the XML into a dictionary, serialize it as JSON, and print it.
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(json.dumps(xmltodict.parse(my_xml)))
Here, we use the parse(...) function to convert the XML data into a Python dictionary, and then json.dumps(...) from the json module to serialize that dictionary as a JSON string, which pprint prints.
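Because json.dumps(...) returns a single string, pprint shows it as a quoted string literal rather than as indented JSON. As an alternative sketch (not part of the original program), the indent parameter of json.dumps produces a more readable layout. The variable names parsed and json_text are only illustrative.

```python
import json

import xmltodict

my_xml = """
<audience>
    <id what="attribute">123</id>
    <name>Shubham</name>
</audience>
"""

# parse(...) returns a dictionary; attributes are prefixed with "@"
# and the text of an element that has attributes is stored under "#text".
parsed = xmltodict.parse(my_xml)

# indent=2 asks json.dumps to emit one key per line with nested indentation.
json_text = json.dumps(parsed, indent=2)
print(json_text)
```

Note that every value comes out as a string ("123", not 123), since XML itself carries no type information.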
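Converting an XML file to JSON
Keeping XML data in the code itself is not always possible, nor is it realistic. Usually we keep our data in a database or in files. We can read a file directly and convert it to JSON.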
import xmltodict
import pprint
import json

# Read the whole XML file and parse it into a dictionary.
with open('person.xml') as fd:
    doc = xmltodict.parse(fd.read())

pp = pprint.PrettyPrinter(indent=4)
pp.pprint(json.dumps(doc))
Here, we again used the pprint module to print the output. Apart from that, using the open(...) function was straightforward: we used it to get a file object, read its contents, and parsed them into a dictionary that json.dumps(...) then serialized as JSON.
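The file-based flow can also be sketched end to end, including writing the JSON back to disk. This is a minimal sketch under a few assumptions: the contents of person.xml are made up here (the sample data from the earlier section), the file lives in a temporary directory so the example is self-contained, and the file object is passed to parse(...) in binary mode instead of calling read() first.

```python
import json
import tempfile
from pathlib import Path

import xmltodict

# Hypothetical file contents, reusing the sample data from earlier.
SAMPLE_XML = '<audience><id what="attribute">123</id><name>Shubham</name></audience>'

with tempfile.TemporaryDirectory() as tmp:
    xml_path = Path(tmp) / "person.xml"
    xml_path.write_text(SAMPLE_XML, encoding="utf-8")

    # parse(...) also accepts a file-like object opened in binary mode.
    with open(xml_path, "rb") as fd:
        doc = xmltodict.parse(fd)

    # Write the converted data next to the XML file as person.json.
    json_path = xml_path.with_suffix(".json")
    json_path.write_text(json.dumps(doc, indent=2), encoding="utf-8")

    # Read the JSON back to confirm the conversion round-trips.
    round_trip = json.loads(json_path.read_text(encoding="utf-8"))

print(round_trip["audience"]["name"])
```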
Python XML to Dict
As the module name suggests, xmltodict actually converts the XML data we provide into a plain Python dictionary (/community/tutorials/python-dictionary).
import xmltodict

my_xml = """
    <audience>
        <id what="attribute">123</id>
        <name>Shubham</name>
    </audience>
"""
my_dict = xmltodict.parse(my_xml)
# The <id> element has an attribute, so its value is a nested dictionary.
print(my_dict['audience']['id'])
# Attributes are stored under keys prefixed with "@".
print(my_dict['audience']['id']['@what'])
So, the tags can be used as dictionary keys, and the attributes can as well. The attribute keys just need to be prefixed with the @ symbol. Because the id element carries an attribute, its value is itself a dictionary, with the element text stored under the #text key.
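The access patterns can be sketched together as follows. The second name element ("Asha") is a made-up addition to the sample data, included only to show that repeated sibling tags are collected into a list.

```python
import xmltodict

my_xml = """
<audience>
    <id what="attribute">123</id>
    <name>Shubham</name>
    <name>Asha</name>
</audience>
"""

audience = xmltodict.parse(my_xml)["audience"]

# Text of an element that also has attributes lives under "#text".
id_value = audience["id"]["#text"]

# Attributes are reachable through "@"-prefixed keys.
id_attr = audience["id"]["@what"]

# Repeated sibling elements become a list of values.
names = audience["name"]

print(id_value, id_attr, names)
```

One consequence of this design is that a tag yields a plain string when it appears once and a list when it repeats, so code that consumes the dictionary may need to handle both shapes.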
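Supporting namespaces in XML
XML data usually declares a set of namespaces that define the scope of the data the file provides. When converting to JSON, these namespaces should be preserved in the JSON output as well. Consider the following XML file: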
<root xmlns="https://defaultns.com/"
      xmlns:a="https://a.com/">
    <audience>
        <id what="attribute">123</id>
        <name>Shubham</name>
    </audience>
</root>
Here is a sample program showing how to include the XML namespaces in the JSON output:
import xmltodict
import pprint
import json

# process_namespaces=True expands each tag name with its namespace URI.
with open('person.xml') as fd:
    doc = xmltodict.parse(fd.read(), process_namespaces=True)

pp = pprint.PrettyPrinter(indent=4)
pp.pprint(json.dumps(doc))
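With process_namespaces=True, each element key in the result is prefixed with the URI of the namespace it belongs to, so the namespace information carries over into the JSON.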
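To make the effect on the keys concrete, here is a small sketch that parses the same XML from an inline string instead of a file. It also uses the optional namespaces argument of parse(...), which maps a namespace URI to a shorter prefix, or to None to drop it from the keys. The variable names are illustrative only.

```python
import xmltodict

NS = "https://defaultns.com/"

my_xml = """
<root xmlns="https://defaultns.com/"
      xmlns:a="https://a.com/">
    <audience>
        <id what="attribute">123</id>
        <name>Shubham</name>
    </audience>
</root>
"""

# Keys are expanded to "<namespace URI>:<tag name>".
expanded = xmltodict.parse(my_xml, process_namespaces=True)
root = expanded[NS + ":root"]
expanded_name = root[NS + ":audience"][NS + ":name"]

# Mapping the default namespace to None removes it from the keys again.
collapsed = xmltodict.parse(
    my_xml,
    process_namespaces=True,
    namespaces={NS: None},
)
collapsed_name = collapsed["root"]["audience"]["name"]

print(expanded_name, collapsed_name)
```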
JSON to XML conversion
Although converting XML to JSON is the main purpose of this module, xmltodict also supports the reverse operation of converting JSON to XML.
import xmltodict

student = {
    "data": {
        "name": "Shubham",
        "marks": {
            "math": 92,
            "english": 99
        },
        "id": "s387hs3"
    }
}

# The single top-level key ("data") becomes the root XML element.
print(xmltodict.unparse(student, pretty=True))
Please note that the data must have a single key at the top level for this to work correctly. Suppose we modify our program so that the data contains multiple keys at the very first level:
import xmltodict

student = {
    "name": "Shubham",
    "marks": {
        "math": 92,
        "english": 99
    },
    "id": "s387hs3"
}

# Three top-level keys and no single root: this call fails.
print(xmltodict.unparse(student, pretty=True))
In this case, we have three keys at the root level. If we try to unparse data in this form, xmltodict raises a ValueError. This happens because xmltodict uses the single top-level key as the root XML tag, and an XML document can have only one root element. This means that there should be only a single key at the root level of the data.
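A simple way around this restriction is to wrap the data in one enclosing key before calling unparse(...). The sketch below shows both the failure and the workaround; the root name "student" is an arbitrary choice made for this example.

```python
import xmltodict

student = {
    "name": "Shubham",
    "marks": {
        "math": 92,
        "english": 99,
    },
    "id": "s387hs3",
}

# Three top-level keys: unparse(...) refuses to build a document.
error_message = None
try:
    xmltodict.unparse(student)
except ValueError as exc:
    error_message = str(exc)
print("unparse failed:", error_message)

# Wrapping the data under a single key gives the document one root element.
wrapped = {"student": student}
xml_text = xmltodict.unparse(wrapped, pretty=True)
print(xml_text)

# Parsing the XML again returns the same structure, with numbers as strings.
restored = xmltodict.parse(xml_text)["student"]
```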
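Conclusion
In this lesson, we looked at an excellent Python module for parsing XML and converting it to JSON and back. We also learned how to use the xmltodict module to convert XML to a Dict.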