HtmlReader

class HtmlReader(htlm_content: str)

Overview

The HtmlReader class converts HTML content into Markdown. It is a subclass of the generic IReader class.

Parameters

  • htlm_content : str
    • The HTML content to convert.

Attributes

  • soup : BeautifulSoup
    • The BeautifulSoup object parsed from the HTML content.

Methods

def convert_to_markdown(self) -> str
Converts the HTML content to a Markdown string and returns it.
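
To give a rough idea of what such a conversion involves, here is a minimal sketch. It is not the HtmlReader implementation: the class name MiniHtmlToMarkdown is hypothetical, it uses the standard library's html.parser instead of BeautifulSoup so that it runs without third-party packages, and it only handles headings (h1 to h6) and paragraphs.

```python
from html.parser import HTMLParser

# Tags this sketch understands; everything else is ignored.
HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class MiniHtmlToMarkdown(HTMLParser):
    """Hypothetical stand-in for HtmlReader (assumption, not the real class)."""

    def __init__(self, html_content: str):
        super().__init__()
        self._blocks = []        # finished Markdown blocks
        self._md_prefix = None   # prefix of the block being read, None outside a block
        self._parts = []         # text fragments of the current block
        self.feed(html_content)
        self.close()

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            # <h2> becomes "## ", <h3> becomes "### ", and so on.
            self._md_prefix = "#" * int(tag[1]) + " "
            self._parts = []
        elif tag == "p":
            self._md_prefix = ""
            self._parts = []

    def handle_data(self, data):
        # Keep text only while inside a heading or paragraph.
        if self._md_prefix is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._md_prefix is not None and (tag == "p" or tag in HEADINGS):
            self._blocks.append(self._md_prefix + "".join(self._parts).strip())
            self._md_prefix = None

    def convert_to_markdown(self) -> str:
        # Markdown separates block elements with a blank line.
        return "\n\n".join(self._blocks)


print(MiniHtmlToMarkdown("<h2>Title</h2><p>Some text</p>").convert_to_markdown())
```

The real class presumably walks the soup attribute rather than reacting to parser events, and may support many more elements (links, lists, emphasis, tables).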

Usage Example

Code
htlm_content = '<h1>Header</h1><p>Paragraph</p>'
reader = HtmlReader(htlm_content)
markdown = reader.convert_to_markdown()
print(markdown)
Output
# Header

Paragraph