This repository was archived by the owner on Jun 5, 2025. It is now read-only.

Conversation

@aponcedeleonch
Member

Closes: #1218

Check whether the description of a new persona is different enough from the descriptions of the existing personas. This is needed to differentiate correctly between personas.

@aponcedeleonch aponcedeleonch requested review from JAORMX and ptelang March 5, 2025 11:21
@aponcedeleonch aponcedeleonch force-pushed the prevent-similar-personas branch from befb7b4 to 490e355 Compare March 5, 2025 11:21
)
# If the distance is less than the threshold, the persona description is too similar
if persona_distance.distance < self._persona_diff_desc_threshold:
    return False
Contributor

How expensive is it to parallelize this?

Member Author

I tried 2 approaches to get the distances:

  1. Query SQLite directly and get the distances from it
  2. Query all the personas, then use NumPy matrix operations to compute the distances

The result of the experiment was that it didn't matter; both were practically on par.

For this specific comparison, which only checks the threshold, parallelizing it with matrix operations probably makes no difference. I don't expect anyone to have 1000 different personas in their DB. If that happens then yes, we would need to optimize. With a sensible number of personas (<10) it really makes no difference.
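For reference, a minimal sketch of what the second approach could look like. It assumes the persona descriptions are stored as embedding vectors and that the distance is cosine distance; neither is confirmed in this thread. The names `cosine_distances` and `is_description_different` are hypothetical and are not taken from the PR.

```python
import numpy as np


def cosine_distances(new_emb: np.ndarray, existing: np.ndarray) -> np.ndarray:
    """Cosine distance from one embedding to every row of `existing`.

    Assumption: embeddings are non-zero float vectors of equal length.
    """
    new_unit = new_emb / np.linalg.norm(new_emb)
    existing_unit = existing / np.linalg.norm(existing, axis=1, keepdims=True)
    # One matrix-vector product yields all cosine similarities at once.
    return 1.0 - existing_unit @ new_unit


def is_description_different(
    new_emb: np.ndarray, existing: np.ndarray, threshold: float
) -> bool:
    """Return False if any existing persona is closer than `threshold`."""
    if existing.shape[0] == 0:
        # No personas stored yet, so nothing can be too similar.
        return True
    return bool(np.all(cosine_distances(new_emb, existing) >= threshold))
```

The threshold check mirrors the one in the diff: a distance below the threshold means the description is too similar. All distances come from a single matrix-vector product, so there is no Python-level loop over personas.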

Contributor

Let's take an extreme but not unreasonable example: 100 personas. Would we start seeing issues in this case?

@aponcedeleonch aponcedeleonch merged commit da69ec0 into main Mar 5, 2025
11 checks passed
@aponcedeleonch aponcedeleonch deleted the prevent-similar-personas branch March 5, 2025 12:33
Development

Successfully merging this pull request may close these issues.

[Task]: Create Personas with descriptions

3 participants