Commit b4f9ef9

Merge pull request #42 from A-Baji/dev
changelog, handle new naming system
2 parents c6a36e8 + a89c1d6 commit b4f9ef9

6 files changed

+82
-16
lines changed

CHANGELOG.md

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+# Changelog
+
+Observes [Semantic Versioning](https://semver.org/spec/v2.0.0.html) standard and [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) convention.
+
+## [2.0.0] - 06-15-2023
+
+### Added
+
+- a changelog
+- confirmation dialogue for model deletion
+
+### Changed
+
+- flag to use an existing dataset for model creation to allow manual revision
+- user input for model creation updated to work with new discord naming system
+- tweaked readme
+
+## [1.2.2] - 02-22-2023
+
+### Fixed
+
+- bug with end of thought punctuation
+
+## [1.2.1] - 02-19-2023
+
+### Fixed
+
+- bug with min and max thought length
+
+## [1.2.0] - 02-17-2023
+
+### Added
+
+- events flag for job status command
+- min and max thought length parameters for model creation
+
+## [1.1.0] - 02-11-2023
+
+### Changed
+
+- replaced `subprocess` instances with appropriate openAI python API methods
+
+### Removed
+
+- dataset clean up step with openAI fine tune API
+
+## [1.0.1] - 02-11-2023
+
+### Changed
+
+- switched to `pathlib` for file path parsing
+
+[2.0.0]: https://github.com/A-Baji/discordAI-modelizer/compare/1.2.2...2.0.0
+[1.2.2]: https://github.com/A-Baji/discordAI-modelizer/compare/1.2.1...1.2.2
+[1.2.1]: https://github.com/A-Baji/discordAI-modelizer/compare/1.2.0...1.2.1
+[1.2.0]: https://github.com/A-Baji/discordAI-modelizer/compare/1.1.0...1.2.0
+[1.1.0]: https://github.com/A-Baji/discordAI-modelizer/compare/1.0.1...1.1.0
+[1.0.1]: https://github.com/A-Baji/discordAI-modelizer/compare/1.0.0...1.0.1

README.md

Lines changed: 12 additions & 11 deletions
@@ -1,5 +1,5 @@
 # DiscordAI Modelizer
-DiscordAI Modelizer is a python package that can generate custom openai models based on a discord user's chat history in a discord channel. It uses [DiscordChatExporter](https://github.com/Tyrrrz/DiscordChatExporter) to download the logs of a channel. Then, after the logs are processed into a usable dataset, it uses [openAI's API](https://beta.openai.com/docs/introduction) to create a customized model. It also wraps some of the tools from the openAI API to help with managing customizations.
+DiscordAI Modelizer is a python package that can generate custom openai models based on a discord user's chat history in a discord channel. It uses [DiscordChatExporter](https://github.com/Tyrrrz/DiscordChatExporter) to download the logs of a channel, processes the logs into a usable dataset, and then uses [openAI's API](https://beta.openai.com/docs/introduction) to create a customized model. It also wraps some of the tools from the openAI API to help with managing customizations.

 DiscordAI Modelizer is primarily used as a subcomponent of [DiscordAI](https://github.com/A-Baji/discordAI), but may also be used independently.

@@ -14,25 +14,26 @@ DiscordAI Modelizer is primarily used as a subcomponent of [DiscordAI](https://g
 ### Model
 Commands related to your openAI models
 #### `discordai_modelizer model list`
-List your openAi customized models
+* List your openAI models
 #### `discordai_modelizer model create`
-Create a new openAI customized model by downloading the specified chat logs, parsing them into a usable dataset, and then training a customized model using openai
-
-For a proper usage, see the [guide](https://github.com/A-Baji/discordAI#create-a-new-customized-openai-model) for DiscordAI.
+* Create a new openAI customized model by downloading the specified chat logs, parsing them into a usable dataset, and then training a customized model using openai
+* For proper usage, see the [DiscordAI guide](https://github.com/A-Baji/discordAI#create-a-new-customized-openai-model)
 #### `discordai_modelizer model delete`
-Delete an openAI customized model
+* Delete an openAI model
 ### Job
 Commands related to your openAI jobs
 #### `discordai_modelizer job list`
-List your openAI customization jobs
+* List your openAI customization jobs
 #### `discordai_modelizer job follow`
-Follow an openAI customization job
+* Follow an openAI customization job
 #### `discordai_modelizer job status`
-Get an openAI customization job's status
+* Get an openAI customization job's status
 #### `discordai_modelizer job cancel`
-Cancel an openAI customization job
+* Cancel an openAI customization job

 ## Disclaimer
 This application allows users to download the chat history of any channel for which they have permission to invite a bot, and then use those logs to create an openai model based on a user's chat messages. It is important to note that this application should only be used with the consent of all members of the channel. Using this application for malicious purposes, such as impersonation, or without the consent of all members is strictly prohibited.

-By using this application, you agree to use it responsibly. The developers of this application are not responsible for any improper use of the application or any consequences resulting from such use. We strongly discourage using this application for any unethical purposes.
+By using this application, you agree to use it responsibly. The developers of this application are not responsible for any improper use of the application or any consequences resulting from such use. We strongly discourage using this application for any unethical purposes.
+
+This application is not affiliated with or endorsed by Discord, Inc. The use of the term "Discord" in our product name is solely for descriptive purposes to indicate compatibility with the Discord platform.

discordai_modelizer/customize.py

Lines changed: 1 addition & 2 deletions
@@ -58,8 +58,7 @@ def create_model(bot_token: str, openai_key: str, channel_id: str, user_id: str,
         file=open(full_dataset_path, "rb"),
         purpose='fine-tune'
     )
-    file_id = upload_response.id
-    fine_tune=openai.FineTune.create(api_key=openai_key, training_file=file_id, model=base_model, suffix=user_id)
+    fine_tune=openai.FineTune.create(api_key=openai_key, training_file=upload_response.id, model=base_model, suffix=user_id)
     print(f"INFO: Fine tune job id: {fine_tune.id}")
     print("INFO: This may take a few minutes to hours depending on the size of the dataset and the selected base model")
     print("INFO: Use the `job status` command to check on the status of job process")

discordai_modelizer/gen_dataset.py

Lines changed: 9 additions & 1 deletion
@@ -4,16 +4,22 @@
 from datetime import timedelta
 from dateutil import parser
 from string import punctuation
+from os import path
 import pathlib

 def parse_logs(file: str, channel:str, user: str, thought_time=10, thought_max: int = None, thought_min=4):
     files_path = pathlib.Path(user_data_dir(appname="discordai"))
     dataset = open(files_path / f"{channel}_{user}_data_set.jsonl", 'w')
     thought_max = 999999 if not thought_max else thought_max
+    try:
+        username, user_id = user.split('#')
+    except ValueError:
+        username, user_id = user, None
     with open(file, 'r', encoding='utf-8') as data_file:
         data = load(data_file)
         messages = [msg for msg in data['messages']
-                    if f"{msg['author']['name']}#{msg['author']['discriminator']}" == user]
+                    if f'''{msg['author']['name']}{f"#{msg['author']['discriminator']}" if user_id else ''}''' ==
+                    f"{username}{f'#{user_id}' if user_id else ''}"]
         thought = ''
         for i, msg in enumerate(messages):
             msg['content'] = sub(
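
The filter in the hunk above can be read as a simple rule: if the `user` argument has the legacy `name#discriminator` form, both parts must match; otherwise only the author name is compared. A minimal sketch of that rule, pulled out into two hypothetical helpers (`split_user` and `matches_user` are illustrative names, not part of the package; the message layout with `author.name` and `author.discriminator` is taken from the diff):

```python
def split_user(user: str):
    """Split 'name#1234' into (name, discriminator); new-style names yield (name, None)."""
    try:
        username, user_id = user.split('#')
    except ValueError:
        # Zero or several '#' characters: treat the whole string as the username.
        username, user_id = user, None
    return username, user_id


def matches_user(msg: dict, user: str) -> bool:
    """Hypothetical helper mirroring the list comprehension's filter condition."""
    username, user_id = split_user(user)
    author = msg['author']
    if user_id:
        # Legacy tag: name and discriminator must both match.
        return author['name'] == username and str(author['discriminator']) == user_id
    # New naming system: compare the name only.
    return author['name'] == username
```

Splitting the comparison this way avoids the nested f-strings, at the cost of two small functions instead of one expression.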
@@ -46,6 +52,8 @@ def parse_logs(file: str, channel:str, user: str, thought_time=10, thought_max:
                             dumps(
                                 {'prompt': '', 'completion': f'{thought}'
                                  if thought[-1] == '.' else f'{thought}.'}) + "\n")
+    if path.getsize(files_path / f"{channel}_{user}_data_set.jsonl") == 0:
+        print("WARNING: The resulting dataset is empty. Please double check your parameters.")
     dataset.close()


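
One detail worth double-checking in the hunk above: the size test runs while `dataset` is still open. Python buffers text-mode writes, so a small dataset may still be sitting in the buffer and be reported as 0 bytes. A minimal sketch of one way to avoid that, assuming it is acceptable to close the file before checking (the helper name `write_dataset` and its arguments are hypothetical, not the package's API):

```python
import pathlib
from json import dumps


def write_dataset(dataset_path: pathlib.Path, thoughts: list) -> bool:
    """Write one JSON line per thought; return True if the file ends up non-empty."""
    # Leaving the with-block closes the file, which flushes buffered writes to disk.
    with open(dataset_path, 'w') as dataset:
        for thought in thoughts:
            completion = thought if thought.endswith('.') else f'{thought}.'
            dataset.write(dumps({'prompt': '', 'completion': completion}) + "\n")
    # Only now does the on-disk size reflect everything that was written.
    if dataset_path.stat().st_size == 0:
        print("WARNING: The resulting dataset is empty. Please double check your parameters.")
        return False
    return True
```

Calling `dataset.flush()` before the existing `path.getsize` check would be a smaller change with a similar effect.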

discordai_modelizer/version.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "1.2.2"
+__version__ = "2.0.0"

docker-compose.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 # VERSION=$(cat discordai_modelizer/version.py | grep -oP '\d+\.\d+\.\d+') docker compose up --build
 # discordai_modelizer model list --simple
 # discordai_modelizer model create -d $DISCORD_TOKEN -c $CHANNEL_ID -u "$USERNAME"
-# discordai_modelizer model delete -m "text-babbage:001"
+# discordai_modelizer model delete -m "text-babbage-001"
 # discordai_modelizer job list --simple
 # discordai_modelizer job follow -j ft-V31oOgRGZaVZJvZNQFSvSRBl
 # discordai_modelizer job status -j ft-V31oOgRGZaVZJvZNQFSvSRBl --events
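
The first comment above pulls the version string out of `version.py` with `grep -oP`, which depends on GNU grep's Perl-regex mode. A rough Python equivalent, as a sketch only (the function name `read_version` is hypothetical, and it returns just the first `X.Y.Z` match, whereas `grep -o` prints every match):

```python
import re
from typing import Optional


def read_version(text: str) -> Optional[str]:
    """Return the first X.Y.Z version number found in text, or None if there is none."""
    match = re.search(r'\d+\.\d+\.\d+', text)
    return match.group(0) if match else None
```

For example, passing the contents of `discordai_modelizer/version.py` from this commit would return `"2.0.0"`.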
