OPEN AI

Official Site (Most Useful Tool)

Install
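$ pip install "openai<1.0"  # the examples below use the pre-1.0 interface (openai.ChatCompletion, openai.Completion, ...)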
API Key
import os
import openai

# Load your API key from an environment variable or secret management service
openai.api_key = os.getenv("OPENAI_API_KEY")

chat_completion = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hello world"}])
$ openai api chat_completions.create -m gpt-3.5-turbo -g user "Hello world"

Models

chat

import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The 2020 World Series was played in Texas at Globe Life Field in Arlington.",
        "role": "assistant"
      }
    }
  ],
  "created": 1677664795,
  "id": "chatcmpl-7QyqpwdfhqwajicIEznoc6Q47XAyW",
  "model": "gpt-3.5-turbo-0613",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 17,
    "prompt_tokens": 57,
    "total_tokens": 74
  }
}

response['choices'][0]['message']['content']
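
Each request carries the whole conversation in `messages`, so a follow-up question only makes sense if the earlier turns are sent again. Here is a minimal sketch of keeping that history in plain Python. The helper name `add_turn` is hypothetical (it is not part of the openai library), and no API call is made.

```python
def add_turn(history, role, content):
    """Return a new message list with one more turn appended."""
    if role not in ("system", "user", "assistant"):
        raise ValueError(f"unknown role: {role}")
    return history + [{"role": role, "content": content}]


history = [{"role": "system", "content": "You are a helpful assistant."}]
history = add_turn(history, "user", "Who won the world series in 2020?")
history = add_turn(history, "assistant", "The Los Angeles Dodgers won the World Series in 2020.")
history = add_turn(history, "user", "Where was it played?")
# `history` now has the same shape as the `messages` argument above.
```

After each reply, append the assistant message to the list before adding the next user question.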

Completions

import openai

response = openai.Completion.create(
    model="text-davinci-003",
    prompt="Write a tagline for an ice cream shop."
)

{
  "choices": [
    {
      "finish_reason": "length",
      "index": 0,
      "logprobs": null,
      "text": "\n\n\"Let Your Sweet Tooth Run Wild at Our Creamy Ice Cream Shack"
    }
  ],
  "created": 1683130927,
  "id": "cmpl-7C9Wxi9Du4j1lQjdjhxBlO22M61LD",
  "model": "text-davinci-003",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 16,
    "prompt_tokens": 10,
    "total_tokens": 26
  }
}

response['choices'][0]['text']

Image Generations

create

response = openai.Image.create(
    prompt="a white siamese cat",
    n=1,
    size="1024x1024"
)
image_url = response['data'][0]['url']

Edit

response = openai.Image.create_edit(
    image=open("sunlit_lounge.png", "rb"),
    mask=open("mask.png", "rb"),
    prompt="A sunlit indoor lounge area with a pool containing a flamingo",
    n=1,
    size="1024x1024"
)
image_url = response['data'][0]['url']

Variation

response = openai.Image.create_variation(
    image=open("corgi_and_cat_paw.png", "rb"),
    n=1,
    size="1024x1024"
)
image_url = response['data'][0]['url']

FINE-TUNING (NOTE ME)

Prepare training data

Training data should be in JSONL format: one JSON object per line, each holding a prompt and its ideal completion.

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
...
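
A training file in this shape can be produced with the standard `json` module. The sketch below is only an illustration: the helper name `to_jsonl`, the two example pairs, and the file name mentioned in the comment are all hypothetical.

```python
import json


def to_jsonl(pairs):
    """Serialize (prompt, completion) pairs as one JSON object per line."""
    lines = [json.dumps({"prompt": p, "completion": c}) for p, c in pairs]
    return "\n".join(lines) + "\n"


# Made-up training pairs, for illustration only.
examples = [
    ("Tagline for an ice cream shop ->", " Scoops of joy in every cone."),
    ("Tagline for a bakery ->", " Fresh from our oven to your table."),
]
jsonl_text = to_jsonl(examples)
# Save jsonl_text to a file (for example train.jsonl) and pass that path to the CLI.
```

Using `json.dumps` for every line keeps quotes and newlines inside the text properly escaped.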

Create Fine Tuned Model

openai api fine_tunes.create -t <TRAIN_FILE_ID_OR_PATH> -m <BASE_MODEL>

Base models you can choose from: ada, babbage, curie, or davinci.
The command returns the ID of your fine-tuned model.

Create Completion

Then use your own model ID to create a completion!

import openai

openai.Completion.create(
    model=FINE_TUNED_MODEL,
    prompt=YOUR_PROMPT)

Embedding (NOTE ME)

What is Embedding?

Text embeddings let us convert text documents into vectors with a single OpenAI API call.

def get_embedding(text):
    result = openai.Embedding.create(
        model='text-embedding-ada-002',
        input=text
    )
    return result["data"][0]["embedding"]

[
-0.006929283495992422,
-0.005336422007530928,
...
-4.547132266452536e-05,
-0.024047505110502243
]

Examples

Let’s see an example together!
unicorn.csv

ai

import ast
import pandas as pd

df = pd.read_csv("unicorns.csv")

def summary(company, crunchbase_url, city, country, industry, investor_list):
    # The Investors column holds a Python-style list stored as a string
    investors = "The investors in the company are " + ", ".join(ast.literal_eval(investor_list))

    text = f"{company} has headquarters in {city} in {country} and is in the field of {industry}. {investors}. You can find more information at {crunchbase_url}"

    return text

df['summary'] = df.apply(lambda row: summary(row['Company'], row['Crunchbase Url'], row['City'], row['Country'], row['Industry'], row['Investors']), axis=1)

Esusu has headquarters in New York in United States and is in the field of Fintech. The investors in the company are Next Play Ventures, Zeal Capital Partners, SoftBank Group. You can find more information at https://www.cbinsights.com/company/esusu

Remember to calculate tokens first

def get_embedding(text):
    # Note how this function assumes you already set your Open AI key!
    result = openai.Embedding.create(
        model='text-embedding-ada-002',
        input=text
    )
    return result["data"][0]["embedding"]

df['embedding'] = df['summary'].apply(get_embedding)

df = pd.read_csv("unicorns_with_embeddings.csv")

prompt = "What does the company Momenta do and who invested in it?"
prompt_embedding = get_embedding(prompt)

# There are other services/programs for larger amount of vectors
# Take a look at vector search engines like Pinecone or Weaviate
def vector_similarity(vec1, vec2):
    """
    Returns the similarity between two vectors.

    Because OpenAI Embeddings are normalized to length 1, the cosine similarity is the same as the dot product.
    """
    array_vec1 = np.array(ast.literal_eval(vec1))
    array_vec2 = np.array(vec2)
    return np.dot(array_vec1, array_vec2)

df["prompt_similarity"] = df['embedding'].apply(lambda vector: vector_similarity(vector, prompt_embedding))

print(df.sort_values("prompt_similarity", ascending=False).head())

# Could also use sort_values() with ascending=False, but nlargest should be more performant
print(df.nlargest(1,'prompt_similarity').iloc[0]['summary'])
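
The docstring's claim that the dot product equals the cosine similarity for unit-length vectors is easy to check on a toy example. This is a minimal pure-Python sketch with made-up 2-D vectors rather than real embeddings. The helper names are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def normalize(v):
    """Scale a vector to length 1."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


u = normalize([3.0, 4.0])
w = normalize([4.0, 3.0])

dot_uw = dot(u, w)                # plain dot product
cos_uw = cosine_similarity(u, w)  # full cosine formula
# Both values are approximately 0.96, because u and w already have length 1.
```

For vectors that are not normalized, the two numbers differ, which is why the shortcut only applies to normalized embeddings.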

Let’s MAKE a COMPLETE VERSION!

import openai 
import pandas as pd
import tiktoken # https://github.com/openai/tiktoken
import ast
import numpy as np

df = pd.read_csv("embeddings_unicorns.csv")

def get_embedding(text):
    result = openai.Embedding.create(
        model='text-embedding-ada-002',
        input=text
    )
    return result["data"][0]["embedding"]

def vector_similarity(vec1, vec2):
    """
    Returns the similarity between two vectors.

    Because OpenAI Embeddings are normalized to length 1, the cosine similarity is the same as the dot product.
    """
    array_vec1 = np.array(ast.literal_eval(vec1))
    array_vec2 = np.array(vec2)
    return np.dot(array_vec1, array_vec2)

def embed_prompt_lookup():
    # initial question
    question = input("What question do you have about a Unicorn company? ")
    # Get embedding
    prompt_embedding = get_embedding(question)
    # Get prompt similarity with embeddings
    # Note how this will overwrite the prompt similarity column each time!
    df["prompt_similarity"] = df['embedding'].apply(lambda vector: vector_similarity(vector, prompt_embedding))

    # get most similar summary
    summary = df.nlargest(1,'prompt_similarity').iloc[0]['summary']
    #summary = df.sort_values("prompt_similarity", ascending=False).head()

    prompt = f"""Only answer the question below if you have 100% certainty of the facts, use the context below to answer.
Here is some context:
{summary}
Q: {question}
A:"""

    response = openai.Completion.create(
        prompt=prompt,
        temperature=0,
        max_tokens=500,
        model="text-davinci-003"
    )
    print(response["choices"][0]["text"].strip(" \n"))

embed_prompt_lookup()

FAQ
https://help.openai.com/en/articles/6824809-embeddings-frequently-asked-questions

Whisper (Speech To Text)

# Note: you need to be using OpenAI Python v0.27.0 for the code below to work
import openai
audio_file = open("/path/to/file/audio.mp3", "rb")
transcript = openai.Audio.transcribe("whisper-1", audio_file)
{
"text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger.
....
}

How to calculate TOKENS

WHY CALCULATE

It is important to estimate your cost before calling the API, especially for text embedding and fine-tuning.
Note: different models use different encodings!
tiktoken provides several encodings for OpenAI models, including:
"gpt2" for most GPT-3 models
"p50k_base" for code models and Davinci models such as "text-davinci-003"
"cl100k_base" for text-embedding-ada-002 (and gpt-3.5-turbo)

HOW TO USE

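import tiktoken # https://github.com/openai/tiktoken

def num_tokens_from_string(string, encoding_name):
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.get_encoding(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens

# Token count for a single summary
num_tokens_from_string(df['summary'][0], encoding_name='cl100k_base')

# Token count for every summary, then the estimated embedding cost in USD
# (this assumes a price of $0.0004 per 1,000 tokens)
df['token_count'] = df['summary'].apply(lambda text: num_tokens_from_string(text, 'cl100k_base'))
df['token_count'].sum() * 0.0004 / 1000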
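
The cost arithmetic is easier to reuse as a small function. This is only a sketch: `estimate_cost` is a hypothetical helper, the token counts are made up, and the price per 1,000 tokens is a parameter that should be checked against the current pricing page.

```python
def estimate_cost(token_counts, usd_per_1k_tokens):
    """Estimated cost in USD for a list of per-document token counts."""
    total_tokens = sum(token_counts)
    return total_tokens * usd_per_1k_tokens / 1000


# Made-up token counts for three summaries (250 tokens in total).
counts = [50, 75, 125]
cost = estimate_cost(counts, usd_per_1k_tokens=0.0004)
# 250 tokens at $0.0004 per 1,000 tokens is roughly $0.0001.
```

Passing the price as an argument makes it simple to compare models or update the estimate when prices change.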

A cute example from me: Auto Generated QUIZ

ai2

Stocks Chat Bot

Use ChatGPT to analyze a stock

Economic moats: how can the company scale?

SWOT ANALYSIS

A fact-based, data-driven evaluation of a firm’s strengths, weaknesses, opportunities, and threats.

prompt: write me a SWOT analysis on TESLA

Summarize earnings calls

Earnings calls are a key source of valuable information for investors.
On an earnings call, management discusses recent performance and the outlook for the industry, answers questions from analysts, and so on.

prompt: give me top 15 points of this: "web link"

Evaluate ESG
ESG: environmental, social, and governance

prompt: evaluate tesla's ESG credentials

Generate code to backtest buy and sell signals

Backtesting allows a trader to simulate a trading strategy on historical data.
A buy signal is an event or condition selected by a trader or investor as a cue to purchase an asset.
A sell signal is anything that alerts a trader to sell an asset.
The RSI (Relative Strength Index) indicator is a tool that helps people who buy and sell stocks or other assets judge whether something is priced too high or too low.
background:

Low RSI levels, below 30, generate buy signals and indicate an oversold or undervalued condition. High RSI levels, above 70, generate sell signals and suggest that a security is overbought or overvalued. A reading of 50 denotes a neutral level or balance between bullish and bearish positions.

prompt: write me code to see how tesla's stock price performs one month after its RSI dips below 30.
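
To make the 30/70 rule concrete, here is a minimal pure-Python sketch of an RSI calculation and the signal it maps to. It uses simple averages of gains and losses over the window, whereas the classic RSI uses a smoothed average, so values can differ from charting tools. The function names and the price series are hypothetical, and this is not the code ChatGPT returned for the prompt.

```python
def rsi(closes, period=14):
    """RSI over the last `period` price changes, using simple averages."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    # Pair each close with the next one to get the last `period` changes.
    changes = [b - a for a, b in zip(closes[-period - 1:-1], closes[-period:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        # No losses in the window: neutral if flat, otherwise maximum RSI.
        return 50.0 if avg_gain == 0 else 100.0
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)


def rsi_signal(value, low=30, high=70):
    """Map an RSI reading to the buy / sell / neutral rule described above."""
    if value < low:
        return "buy"
    if value > high:
        return "sell"
    return "neutral"


falling = [float(100 - i) for i in range(15)]  # made-up prices, 15 closes
signal = rsi_signal(rsi(falling))              # steady decline maps to "buy"
```

A backtest would then look up the price about one month after each bar where the signal turns to "buy".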

Also, key risks:

prompt: identify the key risks of TESLA stock

Coding:

Simple Moving Average(SMA)

The average closing price of a given security over a specific number of days.
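
For readers who prefer Python to Pine Script, the same definition can be sketched in a few lines. The function name `sma` and the price list are made up for illustration.

```python
def sma(closes, length):
    """Simple moving average of the last `length` closes at each bar.

    Bars without enough history yield None.
    """
    out = []
    for i in range(len(closes)):
        if i + 1 < length:
            out.append(None)
        else:
            window = closes[i + 1 - length:i + 1]
            out.append(sum(window) / length)
    return out


prices = [10.0, 11.0, 12.0, 13.0, 14.0]
sma_3 = sma(prices, 3)  # [None, None, 11.0, 12.0, 13.0]
```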

prompt: generate pine script code to create a simple moving average of the closing price with a period of 14 days

// This source code is subject to the terms of the Mozilla Public License 2.0 at https://mozilla.org/MPL/2.0/
// © AlexNo_17

//@version=5
indicator("My script")
length = 14
sma_value = ta.sma(close, length)

plot(sma_value, color=color.maroon, title="SMA")

ai3

The red line (actually maroon) at the bottom is what we added!

volume

prompt: generate pine script code that turns the background color green when volume is 50% or greater above the volume moving average with a period of 14 days.
//@version=5
indicator("My script")
length = 14
vol_ma = ta.sma(volume, length)

condition = volume >= 1.5 * vol_ma

bgcolor(condition ? color.green : na)

ai4

Automate your trading strategy with ChatGPT

Entry and Exit Strategy

prompt: generate pine script code to enter on a macd crossover. write in version 5.
//@version=5
strategy(title="MACD Crossover Entry", shorttitle="MACD Entry", overlay=true)

// Input parameters
fastLength = input.int(defval=12, title="Fast Length")
slowLength = input.int(defval=26, title="Slow Length")
signalLength = input.int(defval=9, title="Signal Length")

// Calculate MACD
[macdLine, signalLine, _] = ta.macd(close, fastLength, slowLength, signalLength)

// Entry condition: MACD crossover
longCondition = ta.crossover(macdLine, signalLine)
shortCondition = ta.crossunder(macdLine, signalLine)

// Plot entry markers
plotshape(series=longCondition, location=location.belowbar, color=color.green, style=shape.triangleup, size=size.small)
plotshape(series=shortCondition, location=location.abovebar, color=color.red, style=shape.triangledown, size=size.small)
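
The crossover logic can also be sketched outside TradingView. The pure-Python version below is an approximation: its EMA is seeded with the first value, so early bars may not match TradingView's `ta.ema` exactly. The function names and the price series are made up for illustration.

```python
def ema(values, length):
    """Exponential moving average, seeded with the first value."""
    alpha = 2 / (length + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(closes, fast=12, slow=26, signal=9):
    """Return the MACD line and its signal line."""
    fast_ema = ema(closes, fast)
    slow_ema = ema(closes, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)
    return macd_line, signal_line


def crossovers(a, b):
    """Indices where series a moves from at-or-below b to above b."""
    return [i for i in range(1, len(a)) if a[i] > b[i] and a[i - 1] <= b[i - 1]]


# Made-up prices: 40 falling bars followed by 40 rising bars.
closes = [float(100 - i) for i in range(40)] + [float(61 + 2 * i) for i in range(40)]
macd_line, signal_line = macd(closes)
cross_idx = crossovers(macd_line, signal_line)  # bullish crossover bars
```

On this series the MACD line stays below its signal line during the decline, so the first bullish crossover only appears after the trend reverses.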

ai5

What is Position Sizing?

Position sizing refers to the number of units invested in a particular security by an investor or trader. An investor’s account size and risk tolerance should be taken into account when determining appropriate position sizing.
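
One common way to turn account size and risk tolerance into a number of units is to divide the amount you are willing to lose by the distance to your stop loss. The sketch below assumes that convention, which is not the only one. The function names and example numbers are hypothetical.

```python
def position_size(account_size, risk_pct, stop_distance):
    """Units to buy so that hitting the stop loses about risk_pct of the account."""
    if stop_distance <= 0:
        raise ValueError("stop_distance must be positive")
    risk_amount = account_size * risk_pct / 100
    return risk_amount / stop_distance


def take_profit_distance(stop_distance, reward_to_risk):
    """Distance to the profit target for a given reward-to-risk ratio."""
    return stop_distance * reward_to_risk


# $1000 account, 2% risk per trade, stop placed $4 away from the entry price.
units = position_size(account_size=1000, risk_pct=2, stop_distance=4.0)  # 5.0
target = take_profit_distance(4.0, reward_to_risk=2)                     # 8.0
```

In this reading the reward-to-risk ratio sets where the profit target goes and does not change how many units are bought.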

prompt: generate pine script code to determine position size based on a risk tolerance of 2% per trade, an account size of $1000, and a risk to reward ratio of 2 to 1. write in version 5

//@version=5
indicator("Position Size Calculator", overlay = true)

// Inputs
riskTolerance = input(2, title = "Risk Tolerance (%)")
accountSize = input(1000, title = "Account Size ($)")
riskRewardRatio = input(2, title = "Risk to Reward Ratio")

// Calculate maximum risk amount per trade
riskAmount = (riskTolerance / 100) * accountSize

// Calculate stop loss distance (example value, replace with your logic)
stopLossDistance = ta.atr(14) * 2

// Calculate position size
positionSize = (riskAmount / stopLossDistance) * riskRewardRatio

// Plot position size
plot(positionSize, title = "Position Size", color = color.blue, style = plot.style_histogram)

ai6

The black line below is what we added.

prompt: generate a pine script strategy backtest that enters into trades on MACD crosses, determines position size based on a risk tolerance of 2% per trade, an account size of $1000, and a risk to reward ratio of 2:1, uses a 5% stop loss order and a 2.5% trailing stop loss order with a take profit order of 10%.

//@version=5
strategy("MACD Cross Strategy", overlay = true)

// Inputs
riskTolerance = input(2, title = "Risk Tolerance (%)")
accountSize = input(1000, title = "Account Size ($)")
riskRewardRatio = input(2, title = "Risk to Reward Ratio")
stopLossPercentage = input(5, title = "Stop Loss (%)")
trailingStopPercentage = input(2.5, title = "Trailing Stop (%)")
takeProfitPercentage = input(10, title = "Take Profit (%)")
fastLength = input(12, title = "Fast Length")
slowLength = input(26, title = "Slow Length")
signalLength = input(9, title = "Signal Length")

// Calculate maximum risk amount per trade
riskAmount = (riskTolerance / 100) * accountSize

// Calculate stop loss and take profit distances
atrValue = ta.atr(14)
stopLossDistance = atrValue * stopLossPercentage / 100
trailingStopDistance = atrValue * trailingStopPercentage / 100
takeProfitDistance = atrValue * takeProfitPercentage / 100

// Calculate position size based on risk and stop loss
positionSize = (riskAmount / stopLossDistance) * riskRewardRatio

// MACD calculation
[macdLine, signalLine, _] = ta.macd(close, fastLength, slowLength, signalLength)

// Entry conditions
longCondition = ta.crossover(macdLine, signalLine)
shortCondition = ta.crossunder(macdLine, signalLine)

// Exit conditions
longExitCondition = ta.highest(high, 10) - high > trailingStopDistance
shortExitCondition = low - ta.lowest(low, 10) > trailingStopDistance

// Strategy orders
if (longCondition)
    strategy.entry("Long", strategy.long, qty = positionSize)

if (shortCondition)
    strategy.entry("Short", strategy.short, qty = positionSize)

if (longExitCondition)
    strategy.close("Long")

if (shortExitCondition)
    strategy.close("Short")

// Plot signals on chart
plotshape(series = longCondition, style = shape.triangleup, location = location.belowbar, color = color.green)
plotshape(series = shortCondition, style = shape.triangledown, location = location.abovebar, color = color.red)
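
The generated script derives its stop and target distances from ATR scaled by the percentages. Another reading of the prompt treats 5%, 2.5% and 10% as fractions of price. The Python sketch below shows that alternative for a long position. It is an assumption about the intent, not a translation of the Pine code, and the function names and prices are made up.

```python
def exit_levels(entry_price, stop_loss_pct=5.0, take_profit_pct=10.0):
    """Stop-loss and take-profit prices for a long position."""
    stop = entry_price * (1 - stop_loss_pct / 100)
    target = entry_price * (1 + take_profit_pct / 100)
    return stop, target


def trailing_stop(highest_price_since_entry, trail_pct=2.5):
    """Trailing stop that follows the highest price seen since entry."""
    return highest_price_since_entry * (1 - trail_pct / 100)


stop, target = exit_levels(200.0)  # roughly 190 and 220
trail = trailing_stop(220.0)       # roughly 214.5
```

It is worth checking which interpretation a generated strategy uses before trusting its backtest results.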

ai7

Build a stock market & day trading bot using ChatGPT

VWAP
What is VWAP?
VWAP stands for volume-weighted average price.
It is a measure, used in technical analysis, of a security’s average price over a trading period.
It’s a trading benchmark that represents the average price a security has traded at throughout the day, based on both volume and price.
VWAP is important because it provides traders with pricing insight into both the trend and value of a security.
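
As a quick illustration of "based on both volume and price", here is a minimal Python sketch of a running VWAP: the cumulative sum of typical price times volume, divided by cumulative volume. The function name and the two sample bars are made up, and unlike an intraday VWAP this sketch never resets at the start of a new session.

```python
def vwap(highs, lows, closes, volumes):
    """Running VWAP: cumulative(typical price * volume) / cumulative(volume)."""
    out = []
    cum_pv = 0.0
    cum_vol = 0.0
    for h, l, c, v in zip(highs, lows, closes, volumes):
        typical = (h + l + c) / 3  # typical price of the bar
        cum_pv += typical * v
        cum_vol += v
        out.append(cum_pv / cum_vol if cum_vol else None)
    return out


values = vwap(
    highs=[11.0, 13.0],
    lows=[9.0, 11.0],
    closes=[10.0, 12.0],
    volumes=[100, 300],
)
# values == [10.0, 11.5]: the second bar has three times the volume,
# so it pulls the average closer to its typical price of 12.
```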
VOLUME indicator
Definition:
A technical analysis tool that helps traders evaluate the strength of the current market trend by measuring the cumulative volume over a specified period of time.
Formula:
volume strength = volume of a stock / period in time
Traders look to volume to determine liquidity.
To get starting code: go to TradingView, open Indicators, search for "vwap" (by neolao), and look at the source code. It is written in version 3, which is outdated.

prompt: use the code provided to create a strategy in pinescript to buy when the prices cross over vwap, and short when the price crossed below vwap.
//@version=5
strategy("VWAP", overlay=true)

// There are five steps in calculating VWAP:
//
// 1. Calculate the Typical Price for the period. [(High + Low + Close)/3)]
// 2. Multiply the Typical Price by the period Volume (Typical Price x Volume)
// 3. Create a Cumulative Total of Typical Price. Cumulative(Typical Price x Volume)
// 4. Create a Cumulative Total of Volume. Cumulative(Volume)
// 5. Divide the Cumulative Totals.
//
// VWAP = Cumulative(Typical Price x Volume) / Cumulative(Volume)

cumulativePeriod = input(14, "Period")

typicalPrice = (high + low + close) / 3
typicalPriceVolume = typicalPrice * volume
cumulativeTypicalPriceVolume = ta.cum(typicalPriceVolume)
cumulativeVolume = ta.cum(volume)
vwapValue = cumulativeTypicalPriceVolume / cumulativeVolume

plot(vwapValue, color=color.blue)

// Define conditions for strategy entry and exit
longCondition = ta.crossover(close, vwapValue)
shortCondition = ta.crossunder(close, vwapValue)

// Execute the strategy
if (longCondition)
    strategy.entry("Buy", strategy.long)

if (shortCondition)
    strategy.entry("Short", strategy.short)