Notebook

In [1]:

%load_ext watermark
%watermark -a 'cs224' -u -d -v -p openai,redlines,markdown_it,mdformat

Author: cs224

Last updated: 2023-05-08

Python implementation: CPython
Python version       : 3.8.12
IPython version      : 8.2.0

openai     : 0.27.6
redlines   : 0.2.2
markdown_it: 2.2.0
mdformat   : 0.7.16

In [2]:

from IPython.display import display, HTML, Markdown

from IPython.display import display_html
def display_side_by_side(*args):
    html_str=''
    for df in args:
        if type(df) == np.ndarray:
            df = pd.DataFrame(df)
        html_str+=df.to_html()
    html_str = html_str.replace('table','table style="display:inline"')
    # print(html_str)
    display_html(html_str,raw=True)

CSS = """
.output {
    flex-direction: row;
}
"""

def display_graphs_side_by_side(*args):
    html_str='<table><tr>'
    for g in args:
        html_str += '<td>'
        html_str += g._repr_svg_()
        html_str += '</td>'
    html_str += '</tr></table>'
    display_html(html_str,raw=True)
    

display(HTML("<style>.container { width:70% !important; }</style>"))

In [3]:

from redlines import Redlines

In [4]:

import openai
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.getenv('OPENAI_API_KEY')

In [5]:

def get_completion(prompt, model="gpt-3.5-turbo", temperature=0): 
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature, 
    )
    return response.choices[0].message["content"]

In [6]:

blog_post = """
## Rational

Recently, I participated in the free [ChatGPT Prompt Engineering for
Developers](https://www.deeplearning.ai/short-courses/chatgpt-prompt-engineering-for-developers) course by Andrew Ng and Isa Fulford. This triggered
an older urge to use an AI to help me fine tune text that I writte. While simple grammar and spelling mistakes can be spoted by MS Word and similar I
wanted the AI to also make my text more compelling and enhance its appeal. The AI should help me to ensure a smooth and engaging reading
experience. Below you can read my first steps towards that goal.

## Get Started

Before you start you will need a paid account on [platform.openai.com](https://platform.openai.com/), because only then you'll get access to the API
version of ChatGPT. Some users already got beta access to `GPT-4`, but in general you will only have access to `gpt-3.5-turbo` with the API. For our
purposes here that is good enough. And don't worry about the costs. This is only a couple of cents even for extended uses of the API. I never managed
to get over $1 so far. Actually all my efforts to get the code for this blog post working only cost USD 0.02.

Once you have acces to [platform.openai.com](https://platform.openai.com/) you will need to create an API key and put it in a `.env` file like so:

```
OPENAI_API_KEY=sk-...
```

## The Code

### Jupyter Notebook / Python Init

I am using a [Juypter](https://jupyter.org/) notebook and python to access the API. The below explains step by step what I was doing.

You start by setting up the API and defining a helper function:

```python
import openai
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.getenv('OPENAI_API_KEY')

def get_completion(prompt, model="gpt-3.5-turbo", temperature=0):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature,
    )
    return response.choices[0].message["content"]
```


After that I define a little markdown text block as a playground. **Replace the single backtick, single quote, single backtick sequence with a triple
backtick sequence**:


```python
mark_down = '''
# Header 1
Some text under header 1.

## Header 2
More text under header 2.

`' `python
import pigpio

handle = pi.i2c_open(1, 0x58)

def horter_byte_sequence(channel, voltage):
    voltage = int(voltage * 100.0)

    output_buffer = bytearray(3)

    high_byte = voltage >> 8
    low_byte  = voltage & 0xFF;
    output_buffer[0] = (channel & 0xFF)
    output_buffer[1] = low_byte
    output_buffer[2] = high_byte

    return output_buffer

v = horter_byte_sequence(0, 5.0)
pi.i2c_write_device(handle, v)
`' `

### Header 3
Even more text under header 3.

## Another Header 2
Text under another header 2.
'''
```

### Split Markdown at Headings

It helps to have a [syntax](https://daringfireball.net/projects/markdown/syntax) reference for markdown close. As you can read on
[learn.microsoft.com](https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/chatgpt?pivots=programming-language-chat-completions):

> The token limit for gpt-35-turbo is 4096 tokens. These limits include the token count from both the message array sent and the model response. The
> number of tokens[^numtokens] in the messages array combined with the value of the max_tokens parameter must stay under these limits or you'll receive an error.

This means that you can't send your whole blog post in one go to the chatgpt API, but you have to process it in pieces. In addition the approach to
process the blog post in pieces will also make it easier for you later to integrate the suggestions of the AI into your blog post.

I did some searching and it was not as easy as I hoped for to find a python library that allowed me to easily split the input blog post at
headings. Finally I ended up using [markdown-it-py](https://github.com/executablebooks/markdown-it-py). But `markdown_it` is meant to be used to
translate markdown to HTML and out of the box does not work as markdown to markdown converter. After some digging I found at the bottom of its
[using](https://markdown-it-py.readthedocs.io/en/latest/using.html) documentation page that you can use
[mdformat](https://github.com/executablebooks/mdformat) in combination with `markdown_it`.

In addition I remove any `fence` token that anyway does not belong to the standard text flow of the blog post. This results in the following helper
function:

```python
import markdown_it
import mdformat.renderer

def extract_md_sections(md_input_txt):
    md = markdown_it.MarkdownIt()
    options = {}
    env = {}
    md_renderer = mdformat.renderer.MDRenderer()

    tokens = md.parse(md_input_txt)
    md_input_txt = md_renderer.render(tokens, options, env)
    tokens = md.parse(md_input_txt)

    sections = []
    current_section = []

    for token in tokens:
        if token.type == 'heading_open':
            if current_section:
                sections.append(current_section)
            current_section = [token]
        elif token.type == 'fence':
            continue
        elif current_section is not None:
            current_section.append(token)

    if current_section:
        sections.append(current_section)

    sections = [md_renderer.render(section, options, env) for section in sections]
    return sections
```

If you wander about the double call to `md.parse()`: I am not sure if this is strictly necessary, but I noticed that some `fence` tokens might be
missed otherwise.

Now you can try the splitting function on our dummy mardkown text block:


```python
sections = extract_md_sections(mark_down)
for section in sections:
    print(section)
    print('---')
```


And should see the following result:
```
# Header 1

Some text under header 1.

---
## Header 2

More text under header 2.

---
### Header 3

Even more text under header 3.

---
## Another Header 2

Text under another header 2.

---
```
"""

In [7]:

mark_down = '''
# Header 1
Some text under header 1.

## Header 2
More text under header 2.

```python
import pigpio

handle = pi.i2c_open(1, 0x58)

def horter_byte_sequence(channel, voltage):
    voltage = int(voltage * 100.0)

    output_buffer = bytearray(3)

    high_byte = voltage >> 8
    low_byte  = voltage & 0xFF;
    output_buffer[0] = (channel & 0xFF)
    output_buffer[1] = low_byte
    output_buffer[2] = high_byte

    return output_buffer

v = horter_byte_sequence(0, 5.0)
pi.i2c_write_device(handle, v)
```

### Header 3
Even more text under header 3.

## Another Header 2
Text under another header 2.
'''

In [8]:

import markdown_it
import mdformat.renderer

In [9]:

def extract_md_sections(md_input_txt):
    md = markdown_it.MarkdownIt()
    options = {}
    env = {}
    md_renderer = mdformat.renderer.MDRenderer()
    
    tokens = md.parse(md_input_txt)
    md_input_txt = md_renderer.render(tokens, options, env)
    tokens = md.parse(md_input_txt)
    
    sections = []
    current_section = []

    for token in tokens:
        if token.type == 'heading_open':
            if current_section:
                sections.append(current_section)
            current_section = [token]
        elif token.type == 'fence':
            continue
        elif current_section is not None:
            current_section.append(token)

    if current_section:
        sections.append(current_section)

    sections = [md_renderer.render(section, options, env) for section in sections]
    return sections

In [10]:

sections = extract_md_sections(mark_down)
for section in sections:
    print(section)
    print('---')    

# Header 1

Some text under header 1.

---
## Header 2

More text under header 2.

---
### Header 3

Even more text under header 3.

---
## Another Header 2

Text under another header 2.

---

In [11]:

sections = extract_md_sections(blog_post)
for section in sections:
    print(section)
    print('-----------------------------------------------------------------------------------------------------------')    

## Rational

Recently, I participated in the free [ChatGPT Prompt Engineering for
Developers](https://www.deeplearning.ai/short-courses/chatgpt-prompt-engineering-for-developers) course by Andrew Ng and Isa Fulford. This triggered
an older urge to use an AI to help me fine tune text that I writte. While simple grammar and spelling mistakes can be spoted by MS Word and similar I
wanted the AI to also make my text more compelling and enhance its appeal. The AI should help me to ensure a smooth and engaging reading
experience. Below you can read my first steps towards that goal.

-----------------------------------------------------------------------------------------------------------
## Get Started

Before you start you will need a paid account on [platform.openai.com](https://platform.openai.com/), because only then you'll get access to the API
version of ChatGPT. Some users already got beta access to `GPT-4`, but in general you will only have access to `gpt-3.5-turbo` with the API. For our
purposes here that is good enough. And don't worry about the costs. This is only a couple of cents even for extended uses of the API. I never managed
to get over $1 so far. Actually all my efforts to get the code for this blog post working only cost USD 0.02.

Once you have acces to [platform.openai.com](https://platform.openai.com/) you will need to create an API key and put it in a `.env` file like so:

-----------------------------------------------------------------------------------------------------------
## The Code

-----------------------------------------------------------------------------------------------------------
### Jupyter Notebook / Python Init

I am using a [Juypter](https://jupyter.org/) notebook and python to access the API. The below explains step by step what I was doing.

You start by setting up the API and defining a helper function:

After that I define a little markdown text block as a playground. **Replace the single backtick, single quote, single backtick sequence with a triple
backtick sequence**:

-----------------------------------------------------------------------------------------------------------
### Split Markdown at Headings

It helps to have a [syntax](https://daringfireball.net/projects/markdown/syntax) reference for markdown close. As you can read on
[learn.microsoft.com](https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/chatgpt?pivots=programming-language-chat-completions):

> The token limit for gpt-35-turbo is 4096 tokens. These limits include the token count from both the message array sent and the model response. The
> number of tokens\[^numtokens\] in the messages array combined with the value of the max_tokens parameter must stay under these limits or you'll receive an error.

This means that you can't send your whole blog post in one go to the chatgpt API, but you have to process it in pieces. In addition the approach to
process the blog post in pieces will also make it easier for you later to integrate the suggestions of the AI into your blog post.

I did some searching and it was not as easy as I hoped for to find a python library that allowed me to easily split the input blog post at
headings. Finally I ended up using [markdown-it-py](https://github.com/executablebooks/markdown-it-py). But `markdown_it` is meant to be used to
translate markdown to HTML and out of the box does not work as markdown to markdown converter. After some digging I found at the bottom of its
[using](https://markdown-it-py.readthedocs.io/en/latest/using.html) documentation page that you can use
[mdformat](https://github.com/executablebooks/mdformat) in combination with `markdown_it`.

In addition I remove any `fence` token that anyway does not belong to the standard text flow of the blog post. This results in the following helper
function:

If you wander about the double call to `md.parse()`: I am not sure if this is strictly necessary, but I noticed that some `fence` tokens might be
missed otherwise.

Now you can try the splitting function on our dummy mardkown text block:

And should see the following result:

-----------------------------------------------------------------------------------------------------------

In [12]:

i = 0
print(sections[i])

## Rational

Recently, I participated in the free [ChatGPT Prompt Engineering for
Developers](https://www.deeplearning.ai/short-courses/chatgpt-prompt-engineering-for-developers) course by Andrew Ng and Isa Fulford. This triggered
an older urge to use an AI to help me fine tune text that I writte. While simple grammar and spelling mistakes can be spoted by MS Word and similar I
wanted the AI to also make my text more compelling and enhance its appeal. The AI should help me to ensure a smooth and engaging reading
experience. Below you can read my first steps towards that goal.

In [13]:

# The result of this first step should not contain any markup and ignore embedded images no matter if the images are embeddedd via Markdown or HTML tags.
prompt = f"""
Below a text delimited by `'` is provided to you. The text is a snipet from a blog post written in a mix of Markdown and HTML markup.

As a first step extract the pure text. In this first step keep the markup for ordered or unordered list but pay close attention to remove all other markup and especially ignore embedded images no matter if the images are embeddedd via Markdown or HTML tags.

As a second step use the output of the first step and ensure that newlines are only used to separate sections and at the end of enumration items of an ordered or unordered list. 

Provide as your response the output of the second step.

`'`{sections[i]}`'`
"""

response = get_completion(prompt)
print(response)

## Rational

Recently, I participated in the free ChatGPT Prompt Engineering for Developers course by Andrew Ng and Isa Fulford. This triggered an older urge to use an AI to help me fine tune text that I writte. While simple grammar and spelling mistakes can be spoted by MS Word and similar I wanted the AI to also make my text more compelling and enhance its appeal. The AI should help me to ensure a smooth and engaging reading experience. Below you can read my first steps towards that goal.

In [14]:

prompt = f"""Proofread and correct the following section of a blog post. Stay as close as possible to the original and only make modifications to correct grammar or spelling mistakes. Text: ```{response}```"""
response2 = get_completion(prompt)
# display(Markdown(response))
print(response2)

## Rational

Recently, I participated in the free ChatGPT Prompt Engineering for Developers course by Andrew Ng and Isa Fulford. This triggered an older urge to use an AI to help me fine-tune text that I write. While simple grammar and spelling mistakes can be spotted by MS Word and similar, I wanted the AI to also make my text more compelling and enhance its appeal. The AI should help me ensure a smooth and engaging reading experience. Below, you can read my first steps towards that goal.

In [15]:

diff = Redlines(response,response2)
display(Markdown(diff.output_markdown))

Rational¶

Recently, I participated in the free ChatGPT Prompt Engineering for Developers course by Andrew Ng and Isa Fulford. This triggered an older urge to use an AI to help me fine tune fine-tune text that I writte. write. While simple grammar and spelling mistakes can be spoted spotted by MS Word and similar similar, I wanted the AI to also make my text more compelling and enhance its appeal. The AI should help me to ensure a smooth and engaging reading experience. Below Below, you can read my first steps towards that goal.

In [16]:

# prompt = f"""
# Revise the following blog post excerpt, maintaining its original length and style, while enhancing its appeal for a technical hobbyist audience. 
# Improve the reading experience to be smoother and more engaging by making a minimal set of modifications to the original. 
# Text: ```{response2}```
# """
# prompt = f"""
# Refine the following blog post excerpt with minimal alterations, preserving its original length and style, while enhancing its appeal for a discerning reader. Make limited yet impactful changes for a smooth and engaging reading experience. Text: ```{response2}```
# """

prompt = f"""
Below a text delimited by triple quotes is provided to you. The text is a snipet from a blog post.

Walk through the blog post snipet paragraph by paragraph and make a few limited yet impactful changes for a smooth and engaging reading experience targeting a technical hobbyist audience.

Text: ```{response2}```
"""

response3 = get_completion(prompt)
print(response3)

## Introduction

As a technical hobbyist, I'm always on the lookout for ways to improve my writing skills. Recently, I had the opportunity to participate in the free ChatGPT Prompt Engineering for Developers course by Andrew Ng and Isa Fulford. This course reignited my interest in using AI to fine-tune my writing. While tools like MS Word can catch simple grammar and spelling mistakes, I wanted an AI that could take my writing to the next level. I wanted to create a more compelling and engaging reading experience for my audience. In this blog post, I'll share my first steps towards achieving that goal.

In [17]:

# ------------------------------------------------------------------

In [18]:

i = 1
print(sections[i])

## Get Started

Before you start you will need a paid account on [platform.openai.com](https://platform.openai.com/), because only then you'll get access to the API
version of ChatGPT. Some users already got beta access to `GPT-4`, but in general you will only have access to `gpt-3.5-turbo` with the API. For our
purposes here that is good enough. And don't worry about the costs. This is only a couple of cents even for extended uses of the API. I never managed
to get over $1 so far. Actually all my efforts to get the code for this blog post working only cost USD 0.02.

Once you have acces to [platform.openai.com](https://platform.openai.com/) you will need to create an API key and put it in a `.env` file like so:

In [19]:

# The result of this first step should not contain any markup and ignore embedded images no matter if the images are embeddedd via Markdown or HTML tags.
prompt = f"""
Below a text delimited by `'` is provided to you. The text is a snipet from a blog post written in a mix of Markdown and HTML markup.

As a first step extract the pure text. In this first step keep the markup for ordered or unordered list but pay close attention to remove all other markup and especially ignore embedded images no matter if the images are embeddedd via Markdown or HTML tags.

As a second step use the output of the first step and ensure that newlines are only used to separate sections and at the end of enumration items of an ordered or unordered list. 

Provide as your response the output of the second step.

`'`{sections[i]}`'`
"""

response = get_completion(prompt)
print(response)

## Get Started

Before you start you will need a paid account on platform.openai.com, because only then you'll get access to the API version of ChatGPT. Some users already got beta access to GPT-4, but in general you will only have access to gpt-3.5-turbo with the API. For our purposes here that is good enough. And don't worry about the costs. This is only a couple of cents even for extended uses of the API. I never managed to get over $1 so far. Actually all my efforts to get the code for this blog post working only cost USD 0.02.

Once you have acces to platform.openai.com you will need to create an API key and put it in a `.env` file like so:

In [20]:

prompt = f"""Proofread and correct the following section of a blog post. Stay as close as possible to the original and only make modifications to correct grammar or spelling mistakes. Text: ```{response}```"""
response2 = get_completion(prompt)
# display(Markdown(response))
print(response2)

## Get Started

Before you start, you will need a paid account on platform.openai.com because only then will you get access to the API version of ChatGPT. Some users have already received beta access to GPT-4, but in general, you will only have access to gpt-3.5-turbo with the API. For our purposes here, that is good enough. And don't worry about the costs. This is only a couple of cents, even for extended uses of the API. I have never managed to spend over $1 so far. Actually, all my efforts to get the code for this blog post working only cost USD 0.02.

Once you have access to platform.openai.com, you will need to create an API key and put it in a `.env` file like so:```

In [21]:

diff = Redlines(response,response2)
display(Markdown(diff.output_markdown))

Get Started¶

Before you start start, you will need a paid account on platform.openai.com, platform.openai.com because only then you'll will you get access to the API version of ChatGPT. Some users have already got received beta access to GPT-4, but in general general, you will only have access to gpt-3.5-turbo with the API. For our purposes here here, that is good enough. And don't worry about the costs. This is only a couple of cents cents, even for extended uses of the API. I have never managed to get spend over $1 so far. Actually Actually, all my efforts to get the code for this blog post working only cost USD 0.02.

Once you have acces access to platform.openai.com platform.openai.com, you will need to create an API key and put it in a .env file like so:so:```

In [22]:

# prompt = f"""
# Revise the following blog post excerpt, maintaining its original length and style, while enhancing its appeal for a technical hobbyist audience. 
# Improve the reading experience to be smoother and more engaging by making a minimal set of modifications to the original. 
# Text: ```{response2}```
# """
# prompt = f"""
# Refine the following blog post excerpt with minimal alterations, preserving its original length and style, while enhancing its appeal for a discerning reader. Make limited yet impactful changes for a smooth and engaging reading experience. Text: ```{response2}```
# """

prompt = f"""
Below a text delimited by triple quotes is provided to you. The text is a snipet from a blog post.

Walk through the blog post snipet paragraph by paragraph and make a few limited yet impactful changes for a smooth and engaging reading experience targeting a technical hobbyist audience.

Text: ```{response2}```
"""

response3 = get_completion(prompt)
print(response3)

## Getting Started with ChatGPT API

To get started with ChatGPT API, you will need a paid account on platform.openai.com. This will give you access to the API version of ChatGPT. Although some users have already received beta access to GPT-4, for our purposes here, gpt-3.5-turbo with the API is good enough. Don't worry about the costs, as it only costs a couple of cents, even for extended uses of the API. In fact, I have never spent over $1 so far. To get the code for this blog post working, I only spent USD 0.02.

To create an API key, you need to log in to platform.openai.com and follow the instructions. Once you have created the API key, put it in a `.env` file as shown below:`````

In [23]:

# ------------------------------------------------------------------

In [24]:

i = 4
print(sections[i])

### Split Markdown at Headings

In addition I remove any `fence` token that anyway does not belong to the standard text flow of the blog post. This results in the following helper
function:

If you wander about the double call to `md.parse()`: I am not sure if this is strictly necessary, but I noticed that some `fence` tokens might be
missed otherwise.

Now you can try the splitting function on our dummy mardkown text block:

And should see the following result:

In [25]:

# The result of this first step should not contain any markup and ignore embedded images no matter if the images are embeddedd via Markdown or HTML tags.
prompt = f"""
Below a text delimited by `'` is provided to you. The text is a snipet from a blog post written in a mix of Markdown and HTML markup.

As a first step extract the pure text. In this first step keep the markup for ordered or unordered list but pay close attention to remove all other markup and especially ignore embedded images no matter if the images are embeddedd via Markdown or HTML tags.

As a second step use the output of the first step and ensure that newlines are only used to separate sections and at the end of enumration items of an ordered or unordered list. 

Provide as your response the output of the second step.

`'`{sections[i]}`'`
"""

response = get_completion(prompt)
print(response)

Split Markdown at Headings

It helps to have a syntax reference for markdown close. As you can read on The token limit for gpt-35-turbo is 4096 tokens. These limits include the token count from both the message array sent and the model response. The number of tokens in the messages array combined with the value of the max_tokens parameter must stay under these limits or you'll receive an error.

This means that you can't send your whole blog post in one go to the chatgpt API, but you have to process it in pieces. In addition the approach to process the blog post in pieces will also make it easier for you later to integrate the suggestions of the AI into your blog post.

I did some searching and it was not as easy as I hoped for to find a python library that allowed me to easily split the input blog post at headings. Finally I ended up using markdown-it-py. But markdown_it is meant to be used to translate markdown to HTML and out of the box does not work as markdown to markdown converter. After some digging I found at the bottom of its using documentation page that you can use mdformat in combination with markdown_it.

In addition I remove any fence token that anyway does not belong to the standard text flow of the blog post. This results in the following helper function:

If you wander about the double call to md.parse(): I am not sure if this is strictly necessary, but I noticed that some fence tokens might be missed otherwise.

Now you can try the splitting function on our dummy mardkown text block:

And should see the following result:

In [26]:

prompt = f"""Proofread and correct the following section of a blog post. Stay as close as possible to the original and only make modifications to correct grammar or spelling mistakes. Text: ```{response}```"""
response2 = get_completion(prompt)
# display(Markdown(response))
print(response2)

Split Markdown at Headings

It helps to have a syntax reference for Markdown close. As you can read on, the token limit for GPT-35-Turbo is 4096 tokens. These limits include the token count from both the message array sent and the model response. The number of tokens in the messages array combined with the value of the max_tokens parameter must stay under these limits, or you'll receive an error.

This means that you can't send your whole blog post in one go to the ChatGPT API, but you have to process it in pieces. In addition, the approach to process the blog post in pieces will also make it easier for you later to integrate the suggestions of the AI into your blog post.

I did some searching, and it was not as easy as I hoped to find a Python library that allowed me to easily split the input blog post at headings. Finally, I ended up using markdown-it-py. But markdown_it is meant to be used to translate Markdown to HTML and out of the box does not work as a Markdown to Markdown converter. After some digging, I found at the bottom of its using documentation page that you can use mdformat in combination with markdown_it.

In addition, I remove any fence token that does not belong to the standard text flow of the blog post. This results in the following helper function:

If you wonder about the double call to md.parse(): I am not sure if this is strictly necessary, but I noticed that some fence tokens might be missed otherwise.

Now you can try the splitting function on our dummy Markdown text block:

And should see the following result:

In [27]:

diff = Redlines(response,response2)
display(Markdown(diff.output_markdown))

Split Markdown at Headings

It helps to have a syntax reference for markdown Markdown close. As you can read on The on, the token limit for gpt-35-turbo GPT-35-Turbo is 4096 tokens. These limits include the token count from both the message array sent and the model response. The number of tokens in the messages array combined with the value of the max_tokens parameter must stay under these limits limits, or you'll receive an error.

This means that you can't send your whole blog post in one go to the chatgpt ChatGPT API, but you have to process it in pieces. In addition addition, the approach to process the blog post in pieces will also make it easier for you later to integrate the suggestions of the AI into your blog post.

I did some searching searching, and it was not as easy as I hoped for to find a python Python library that allowed me to easily split the input blog post at headings. Finally Finally, I ended up using markdown-it-py. But markdown_it is meant to be used to translate markdown Markdown to HTML and out of the box does not work as markdown to markdown a Markdown to Markdown converter. After some digging digging, I found at the bottom of its using documentation page that you can use mdformat in combination with markdown_it.

In addition addition, I remove any fence token that anyway does not belong to the standard text flow of the blog post. This results in the following helper function:

If you wander wonder about the double call to md.parse(): I am not sure if this is strictly necessary, but I noticed that some fence tokens might be missed otherwise.

Now you can try the splitting function on our dummy mardkown Markdown text block:

And should see the following result:

In [28]:

# prompt = f"""
# Revise the following blog post excerpt, maintaining its original length and style, while enhancing its appeal for a technical hobbyist audience. 
# Improve the reading experience to be smoother and more engaging by making a minimal set of modifications to the original. 
# Text: ```{response2}```
# """
# prompt = f"""
# Refine the following blog post excerpt with minimal alterations, preserving its original length and style, while enhancing its appeal for a discerning reader. Make limited yet impactful changes for a smooth and engaging reading experience. Text: ```{response2}```
# """

prompt = f"""
Below a text delimited by triple quotes is provided to you. The text is a snipet from a blog post.

Walk through the blog post snipet paragraph by paragraph and make a few limited yet impactful changes for a smooth and engaging reading experience targeting a technical hobbyist audience.

Text: ```{response2}```
"""

response3 = get_completion(prompt)
print(response3)

Splitting Markdown at Headings

For technical hobbyists, having a syntax reference for Markdown close by is always helpful. As you read on, keep in mind that the token limit for GPT-35-Turbo is 4096 tokens, which includes the token count from both the message array sent and the model response. If the number of tokens in the messages array combined with the value of the max_tokens parameter exceeds these limits, you'll receive an error.

This means that you can't send your entire blog post in one go to the ChatGPT API. Instead, you have to process it in pieces. This approach will also make it easier for you to integrate the AI's suggestions into your blog post later on.

I searched for a Python library that would allow me to easily split the input blog post at headings, but it wasn't as easy as I had hoped. Eventually, I ended up using markdown-it-py. However, markdown_it is meant to be used to translate Markdown to HTML and out of the box does not work as a Markdown to Markdown converter. After some digging, I found out that you can use mdformat in combination with markdown_it.

Additionally, I remove any fence token that does not belong to the standard text flow of the blog post. This results in the following helper function:

If you're wondering about the double call to md.parse(), I'm not entirely sure if it's necessary, but I noticed that some fence tokens might be missed otherwise.

Now, you can try the splitting function on our dummy Markdown text block and see the following result:

In [ ]: