Changelog:
4.6.3 (20180812)
- Exactly the same as 4.6.2. Re-released to make the README file render properly on PyPI.
4.6.2 (20180812)
- Fix an exception when a custom formatter was asked to format a void element.
4.6.1 (20180728)
- Stop data loss when encountering an empty numeric entity, and possibly in other cases.
- Preserve XML namespaces introduced inside an XML document, not just the ones introduced at the top level.
- Added a new formatter, "html5", which represents void elements as "<element>" rather than "<element/>".
- Fixed a problem where the html.parser tree builder interpreted a string like "&foo " as the character entity "&foo;".
- Correctly handle invalid HTML numeric character entities like &#147; which reference code points that are not Unicode code points. Note that this is only fixed when Beautiful Soup is used with the html.parser parser -- html5lib already worked and I couldn't fix it with lxml.
- Improved the warning given when no parser is specified.
- When markup contains duplicate elements, a select() call that includes multiple match clauses will match all relevant elements.
- Fixed code that was causing deprecation warnings in recent Python 3 versions.
- Fixed a Windows crash in diagnose() when checking whether a long markup string is a filename.
- Stopped HTMLParser from raising an exception in very rare cases of bad markup.
- Fixed a bug where find_all() was not working when asked to find a tag with a namespaced name in an XML document that was parsed as HTML.
- You can get finer control over formatting by subclassing bs4.element.Formatter and passing a Formatter instance into (e.g.) encode().
- You can pass a dictionary of attrs into BeautifulSoup.new_tag. This makes it possible to create a tag with an attribute like 'name' that would otherwise be masked by another argument of new_tag.
- Clarified the deprecation warning when accessing tag.fooTag, to cover the possibility that you might really have been looking for a tag called 'fooTag'.
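
Two of the entries above, the "html5" formatter and the attrs dictionary for BeautifulSoup.new_tag, can be illustrated with a short sketch. It assumes an installed Beautiful Soup release that includes both changes and uses the html.parser tree builder; the markup and the variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# html.parser ships with Python, so no third-party parser is required.
soup = BeautifulSoup("<p>one<br/>two</p>", "html.parser")

# The default "minimal" formatter writes void elements as "<br/>";
# the "html5" formatter writes them as "<br>".
minimal_output = soup.decode(formatter="minimal")
html5_output = soup.decode(formatter="html5")

# Passing attrs as a dictionary makes it possible to set an attribute
# called 'name', which would otherwise collide with new_tag's own
# 'name' argument (the tag name).
tag = soup.new_tag("input", attrs={"name": "email"})

print(minimal_output)
print(html5_output)
print(tag.name, tag["name"])
```

Here tag.name is the tag's own name ("input"), while tag["name"] is the attribute supplied through attrs.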