DOM (Document Object Model)

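The DOM is a convention for representing, and interacting with, the objects in HTML and XML documents. A parsed document becomes a tree of nodes (elements, attributes, text) that a program can traverse and change.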

The Document Object Model, or “DOM,” is a cross-language API from the World Wide Web Consortium (W3C) for accessing and modifying XML documents.

In the context of machine learning, understanding the DOM is useful for data-collection tasks such as writing web crawlers, which have to locate and extract content from the structure of a page.

Python Example

# A web page DOM contains representations of structure, style, and content.
import xml.dom.minidom

# Define a test DOM document.
document = """\
<slideshow>
<title>Demo slideshow</title>
<slide><title>Slide title</title>
<point>This is a demo</point>
<point>Of a program for processing slides</point>
</slide>

<slide><title>Another demo slide</title>
<point>It is important</point>
<point>To have more than</point>
<point>one slide</point>
</slide>
</slideshow>
"""

# Create a DOM object.
dom = xml.dom.minidom.parseString(document)

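# Access DOM elements by tag name. This returns a list of every <title>
# element in document order, not a single element.
titles = dom.getElementsByTagName("title")

# Read the text of the first <title>, which is the slideshow title.
print(titles[0].firstChild.data)  # Demo slideshow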
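
Accessing and Modifying Example

The definition above describes the DOM as an API for both accessing and modifying documents. The following is a minimal sketch of both, using a shortened copy of the slideshow document so that it runs on its own. The helper names node_text, slide_outline, and add_point are hypothetical and chosen only for illustration; the calls they wrap (getElementsByTagName, createElement, createTextNode, appendChild) are part of xml.dom.minidom.

```python
import xml.dom.minidom

# Shortened copy of the slideshow document, so this sketch is self-contained.
SLIDES_XML = """\
<slideshow>
<title>Demo slideshow</title>
<slide><title>Slide title</title>
<point>This is a demo</point>
<point>Of a program for processing slides</point>
</slide>
</slideshow>
"""


def node_text(element):
    """Concatenate the text nodes directly under an element (hypothetical helper)."""
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )


def slide_outline(dom):
    """Return one (slide title, [points]) tuple per <slide> element."""
    outline = []
    for slide in dom.getElementsByTagName("slide"):
        # Searching from the slide element only finds titles inside that slide.
        title = node_text(slide.getElementsByTagName("title")[0])
        points = [node_text(p) for p in slide.getElementsByTagName("point")]
        outline.append((title, points))
    return outline


def add_point(dom, slide_index, text):
    """Append a new <point> element to the slide at slide_index."""
    slide = dom.getElementsByTagName("slide")[slide_index]
    point = dom.createElement("point")
    point.appendChild(dom.createTextNode(text))
    slide.appendChild(point)


dom = xml.dom.minidom.parseString(SLIDES_XML)
add_point(dom, 0, "Added through the DOM")
for title, points in slide_outline(dom):
    print(title, points)
```

Calling getElementsByTagName on a slide element rather than on the whole document restricts the search to that slide's subtree, which is why the slideshow-level title does not appear in the outline.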