Is there a way to replace the text of a NavigableString object with a Tag object in BeautifulSoup?

I have a sample HTML document:

html_doc = '''<html><body><div>
<h5>This is my heading 1</h5>
<p>I have some content here</p>
I am point one.\n\nI am point two.
<h5>Some more text here</h5> Some more text outside a tag.</div></body></html>'''

I'm trying to take the text on lines 4 and 5 that sits outside any HTML tag and convert it into p tag elements. This is what I have tried:

from bs4 import BeautifulSoup
from bs4.element import NavigableString

soup = BeautifulSoup(html_doc, 'html.parser')
div_tag = soup.div

for idx in range(len(div_tag.contents)):
    if type(div_tag.contents[idx]) == NavigableString:
        count = 0
        for a_str in div_tag.contents[idx].split('\n'):
            if a_str == '':
                continue
            else:
                count +=1
                tag = soup.new_tag("p")
                tag.string = a_str
                div_tag.contents[idx+count].insert_before(tag)

With the code above, I can't convert the last NavigableString into a p tag. The original NavigableString text also stays in the tree. The desired output is:

<html><body><div>
<h5>This is my heading 1</h5>
<p>I have some content here</p>
<p>I am point one.</p>
<p>I am point two.</p>
<h5>Some more text here</h5>
<p>Some more text outside a tag.
</p></div></body></html>

Answer

You can use this example to wrap all lines that are outside html tags into <p>...</p>:

from bs4 import BeautifulSoup, NavigableString

html_doc = """<html><body><div>
<h5>This is my heading 1</h5>
<p>I have some content here</p>
I am point one.\n\nI am point two.
<h5>Some more text here</h5> Some more text outside a tag.</div></body></html>"""

soup = BeautifulSoup(html_doc, "html.parser")

# root tag of the text:
root_tag = soup.find("div")

# replace all strings that are "outside" in the root tag:
for c in list(root_tag.contents):  # iterate over a copy, since replace_with() changes the list
    if isinstance(c, NavigableString) and c.strip():
        to_replace = [
            "<p>{}</p>".format(line)
            for line in map(str.strip, c.split("\n"))
            if line
        ]

        c.replace_with(
            BeautifulSoup("\n" + "\n".join(to_replace) + "\n", "html.parser")
        )

print(soup)

Prints:

<html><body><div>
<h5>This is my heading 1</h5>
<p>I have some content here</p>
<p>I am point one.</p>
<p>I am point two.</p>
<h5>Some more text here</h5>
<p>Some more text outside a tag.</p>
</div></body></html>
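
If you would rather not build an HTML string and parse it again, a variant closer to the original attempt is possible. The following is only a sketch under a few assumptions: the helper name `wrap_loose_text` is made up for this example, and it only looks at the direct children of the tag you pass in. It creates each p tag with `new_tag()`, inserts it right before the text node it came from (so no index arithmetic is needed, including for the last text node), and then removes the text node with `extract()`. Because the line is assigned through `.string`, it is treated as plain text rather than parsed as markup.

```python
from bs4 import BeautifulSoup, NavigableString


def wrap_loose_text(root, soup):
    """Wrap each non-empty line of loose text directly under `root` in a <p> tag.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    # Iterate over a copy, because the loop modifies root.contents.
    for node in list(root.contents):
        if not isinstance(node, NavigableString) or not node.strip():
            continue  # skip tags and whitespace-only strings
        for line in node.split("\n"):
            line = line.strip()
            if not line:
                continue
            p = soup.new_tag("p")
            p.string = line        # stored as text, not parsed as HTML
            node.insert_before(p)  # position is relative to the text node
        node.extract()             # remove the original loose text


html_doc = """<html><body><div>
<h5>This is my heading 1</h5>
<p>I have some content here</p>
I am point one.\n\nI am point two.
<h5>Some more text here</h5> Some more text outside a tag.</div></body></html>"""

soup = BeautifulSoup(html_doc, "html.parser")
wrap_loose_text(soup.div, soup)
print(soup)
```

Unlike the version above, this one does not add newlines around the new p tags, so the printed markup is more compact; the resulting tree has the same p elements.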