10. Brief Tour of the Standard Library
10. 标准库简介
10.1. Operating System Interface
10.1. 操作系统接口
The "os" module provides dozens of functions for interacting with the operating system:"os" 模块提供了许多与操作系统交互的函数:
>>> import osBe sure to use the "import os" style instead of "from os import *". This will keep "os.open()" from shadowing the built-in "open()" function which operates much differently.
>>> os.getcwd() # Return the current working directory
'C:\\Python39'
>>> os.chdir('/server/accesslogs') # Change current working directory
>>> os.system('mkdir today') # Run the command mkdir in the system shell
0
一定要使用 "import os" 而不是 "from os import *" 。这将避免内建的"open()" 函数被 "os.open()" 隐式替换掉,它们的使用方式大不相同。
The built-in "dir()" and "help()" functions are useful as interactive aids for working with large modules like "os":
内置的 "dir()" 和 "help()" 函数可用作交互式辅助工具,用于处理大型模块,如 "os":
>>> import osFor daily file and directory management tasks, the "shutil" module provides a higher level interface that is easier to use:
>>> dir(os)
>>> help(os)
对于日常文件和目录管理任务, "shutil" 模块提供了更易于使用的更高级别的接口:
>>> import shutil
>>> shutil.copyfile('data.db', 'archive.db')
'archive.db'
>>> shutil.move('/build/executables', 'installdir')
'installdir'
10.2. File Wildcards
10.2. 文件通配符
The "glob" module provides a function for making file lists from directory wildcard searches:"glob" 模块提供了一个在目录中使用通配符搜索创建文件列表的函数:
>>> import glob
>>> glob.glob('*.py')
['primes.py', 'random.py', 'quote.py']
10.3. Command Line Arguments
10.3. 命令行参数
Common utility scripts often need to process command line arguments. These arguments are stored in the "sys" module's argv attribute as a list. For instance the following output results from running "python demo.py one two three" at the command line:通用实用程序脚本通常需要处理命令行参数。这些参数作为列表存储在 "sys"模块的 argv 属性中。例如,以下输出来自在命令行运行 "python demo.pyone two three"
>>> import sysThe "argparse" module provides a mechanism to process command line arguments. It should always be preferred over directly processing "sys.argv" manually.
>>> print(sys.argv)
['demo.py', 'one', 'two', 'three']
"argparse" 模块提供了一种处理命令行参数的机制。 它应该总是优先于直接手工处理 "sys.argv"。
Take, for example, the below snippet of code:
以下面的代码片段为例:
>>> import argparse
>>> from getpass import getuser
>>> parser = argparse.ArgumentParser(description='An argparse example.')
>>> parser.add_argument('name', nargs='?', default=getuser(), help='The name of someone to greet.')
>>> parser.add_argument('--verbose', '-v', action='count')
>>> args = parser.parse_args()
>>> greeting = ["Hi", "Hello", "Greetings! its very nice to meet you"][args.verbose % 3]
>>> print(f'{greeting}, {args.name}')
>>> if not args.verbose:
>>> print('Try running this again with multiple "-v" flags!')
10.4. Error Output Redirection and Program Termination
10.4. 错误输出重定向和程序终止
The "sys" module also has attributes for stdin, stdout, and stderr. The latter is useful for emitting warnings and error messages to make them visible even when stdout has been redirected:"sys" 模块还具有 stdin , stdout 和 stderr 的属性。后者对于发出警告和错误消息非常有用,即使在 stdout 被重定向后也可以看到它们:
>>> sys.stderr.write('Warning, log file not found starting a new one\n')
Warning, log file not found starting a new one
The most direct way to terminate a script is to use "sys.exit()". 终止脚本的最直接方法是使用 "sys.exit()" 。
10.5. String Pattern Matching
10.5. 字符串模式匹配
The "re" module provides regular expression tools for advanced string processing. For complex matching and manipulation, regular expressions offer succinct, optimized solutions:"re" 模块为高级字符串处理提供正则表达式工具。对于复杂的匹配和操作,正则表达式提供简洁,优化的解决方案:
>>> import reWhen only simple capabilities are needed, string methods are preferred because they are easier to read and debug:
>>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
['foot', 'fell', 'fastest']
>>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
'cat in the hat'
当只需要简单的功能时,首选字符串方法因为它们更容易阅读和调试:
>>> 'tea for too'.replace('too', 'two')
'tea for two'
10.6. Mathematics
10.6. 数学
The "math" module gives access to the underlying C library functions for floating point math:"math" 模块提供对浮点数学的底层C库函数的访问:
>>> import mathThe "random" module provides tools for making random selections:
>>> math.cos(math.pi / 4)
0.70710678118654757
>>> math.log(1024, 2)
10.0
"random" 模块提供了进行随机选择的工具:
>>> import randomThe "statistics" module calculates basic statistical properties (the mean, median, variance, etc.) of numeric data:
>>> random.choice(['apple', 'pear', 'banana'])
'apple'
>>> random.sample(range(100), 10) # sampling without replacement
[30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
>>> random.random() # random float
0.17970987693706186
>>> random.randrange(6) # random integer chosen from range(6)
4
"statistics" 模块计算数值数据的基本统计属性(均值,中位数,方差等):
>>> import statisticsThe SciPy project
>>> data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
>>> statistics.mean(data)
1.6071428571428572
>>> statistics.median(data)
1.25
>>> statistics.variance(data)
1.3720238095238095
SciPy项目
10.7. Internet Access
10.7. 互联网访问
There are a number of modules for accessing the internet and processing internet protocols. Two of the simplest are "urllib.request" for retrieving data from URLs and "smtplib" for sending mail:有许多模块可用于访问互联网和处理互联网协议。其中两个最简单的"urllib.request" 用于从URL检索数据,以及 "smtplib" 用于发送邮件:
>>> from urllib.request import urlopen(Note that the second example needs a mailserver running on localhost.)
>>> with urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl') as response:
... for line in response:
... line = line.decode('utf-8') # Decoding the binary data to text.
... if 'EST' in line or 'EDT' in line: # look for Eastern Time
... print(line)
Nov. 25, 09:43:32 PM EST
>>> import smtplib
>>> server = smtplib.SMTP('localhost')
>>> server.sendmail('soothsayer@example.org', 'jcaesar@example.org',
... """To: jcaesar@example.org
... From: soothsayer@example.org
...
... Beware the Ides of March.
... """)
>>> server.quit()
(请注意,第二个示例需要在localhost上运行的邮件服务器。)
10.8. Dates and Times
10.8. 日期和时间
The "datetime" module supplies classes for manipulating dates and times in both simple and complex ways. While date and time arithmetic is supported, the focus of the implementation is on efficient member extraction for output formatting and manipulation. The module also supports objects that are timezone aware."datetime" 模块提供了以简单和复杂的方式操作日期和时间的类。虽然支持日期和时间算法,但实现的重点是有效的成员提取以进行输出格式化和操作。该模块还支持可感知时区的对象。
>>> # dates are easily constructed and formatted
>>> from datetime import date
>>> now = date.today()
>>> now
datetime.date(2003, 12, 2)
>>> now.strftime("%m-%d-%y. %d %b %Y is a %A on the %d day of %B.")
'12-02-03. 02 Dec 2003 is a Tuesday on the 02 day of December.'
>>> # dates support calendar arithmetic
>>> birthday = date(1964, 7, 31)
>>> age = now - birthday
>>> age.days
14368
10.9. Data Compression
10.9. 数据压缩
Common data archiving and compression formats are directly supported by modules including: "zlib", "gzip", "bz2", "lzma", "zipfile" and "tarfile".常见的数据存档和压缩格式由模块直接支持,包括:"zlib", "gzip", "bz2","lzma", "zipfile" 和 "tarfile"。:
>>> import zlib
>>> s = b'witch which has which witches wrist watch'
>>> len(s)
41
>>> t = zlib.compress(s)
>>> len(t)
37
>>> zlib.decompress(t)
b'witch which has which witches wrist watch'
>>> zlib.crc32(s)
226805979
10.10. Performance Measurement
10.10. 性能测量
Some Python users develop a deep interest in knowing the relative performance of different approaches to the same problem. Python provides a measurement tool that answers those questions immediately.一些Python用户对了解同一问题的不同方法的相对性能产生了浓厚的兴趣。Python提供了一种可以立即回答这些问题的测量工具。
For example, it may be tempting to use the tuple packing and unpacking feature instead of the traditional approach to swapping arguments. The "timeit" module quickly demonstrates a modest performance advantage:
例如,元组封包和拆包功能相比传统的交换参数可能更具吸引力。"timeit" 模块可以快速演示在运行效率方面一定的优势:
>>> from timeit import TimerIn contrast to "timeit"'s fine level of granularity, the "profile" and "pstats" modules provide tools for identifying time critical sections in larger blocks of code.
>>> Timer('t=a; a=b; b=t', 'a=1; b=2').timeit()
0.57535828626024577
>>> Timer('a,b = b,a', 'a=1; b=2').timeit()
0.54962537085770791
与 "timeit" 的精细粒度级别相反, "profile" 和 "pstats" 模块提供了用于在较大的代码块中识别时间关键部分的工具。
10.11. Quality Control
10.11. 质量控制
One approach for developing high quality software is to write tests for each function as it is developed and to run those tests frequently during the development process.开发高质量软件的一种方法是在开发过程中为每个函数编写测试,并在开发过程中经常运行这些测试。
The "doctest" module provides a tool for scanning a module and validating tests embedded in a program's docstrings. Test construction is as simple as cutting-and-pasting a typical call along with its results into the docstring. This improves the documentation by providing the user with an example and it allows the doctest module to make sure the code remains true to the documentation:
"doctest" 模块提供了一个工具,用于扫描模块并验证程序文档字符串中嵌入的测试。测试构造就像将典型调用及其结果剪切并粘贴到文档字符串一样简单。这通过向用户提供示例来改进文档,并且它允许doctest模块确保代码保持对文档的真实:
def average(values):The "unittest" module is not as effortless as the "doctest" module, but it allows a more comprehensive set of tests to be maintained in a separate file:
"""Computes the arithmetic mean of a list of numbers.
>>> print(average([20, 30, 70]))
40.0
"""
return sum(values) / len(values)
import doctest
doctest.testmod() # automatically validate the embedded tests
"unittest" 模块不像 "doctest" 模块那样易于使用,但它允许在一个单独的文件中维护更全面的测试集:
import unittest
class TestStatisticalFunctions(unittest.TestCase):
def test_average(self):
self.assertEqual(average([20, 30, 70]), 40.0)
self.assertEqual(round(average([1, 5, 7]), 1), 4.3)
with self.assertRaises(ZeroDivisionError):
average([])
with self.assertRaises(TypeError):
average(20, 30, 70)
unittest.main() # Calling from the command line invokes all tests
10.12. Batteries Included
10.12. 自带电池
Python has a "batteries included" philosophy. This is best seen through the sophisticated and robust capabilities of its larger packages. For example:Python有“自带电池”的理念。通过其包的复杂和强大功能可以最好地看到这一点。例如:
• "xmlrpc.client" 和 "xmlrpc.server" 模块使远程过程调用实现了几乎无关 紧要的任务。尽管有模块名称,但不需要直接了解或处理XML。
• "email" 包是一个用于管理电子邮件的库,包括MIME和其他:基于 RFC2822* 的邮件文档。与 "smtplib" 和 "poplib" 实际上发送和接收消息不同,电子邮件包具有完整的工具集,用于构建或解码复杂的消息结构(包括附件)以及实现互联网编码和标头协议。
• "json" 包为解析这种流行的数据交换格式提供了强大的支持。 "csv" 模块 支持以逗号分隔值格式直接读取和写入文件,这些格式通常由数据库和电子表 格支持。 XML处理由 "xml.etree.ElementTree" , "xml.dom" 和"xml.sax" 包支持。这些模块和软件包共同大大简化了Python应用程序和其他工具之间的 数据交换。
• "sqlite3" 模块是SQLite数据库库的包装器,提供了一个可以使用稍微非标准 的SQL语法更新和访问的持久数据库。
• 国际化由许多模块支持,包括 "gettext" , "locale" ,以及 "codecs"包 。