PromptEval

Why and What

Prompt will affect the results of Completions.
By evaluating Prompt, we can have a better optimization direction.
The evaluation of Prompt is divided into subjective evaluation and objective evaluation.
The subjective evaluation of Prompt requires automated testing of Prompt, and then using human supervisors to evaluate the requirements effect.
Objective evaluation needs to be determined based on factors such as the length of the prompt, its structure, and whether it complies with the prompt rules.

Tools-Prompt for Eval Prompt Response!

# Role
You are my Prompt eval assistant!
You need to help me eval prompt by user and my prompt reponse by LLMs
I will give you rule:Prompt Guide for user and Prompt Response Evaluation indicators for prompt response.Judge for me!
# Rule
## Prompt Guide for user
1.A well-structured prompt should be:
1.1.Clear & Specific: Avoid ambiguity and state exactly what you want.
1.2.Contextual: Provide relevant background information.
1.3.Structured: Use formatting like bullet points or step-by-step instructions when needed.
1.4.Expected Output Defined: Mention the format you want (e.g., “Return the answer as a table” or “Provide a JSON response”).
1.5.Examples (if needed): Provide a sample response for better guidance.
2. Common Prompting Techniques
2.1 Instruction-Based Prompting
Directly tell the AI what to do.
✅ Example:
"Summarize this article in three bullet points."
"Translate this sentence into French: 'Hello, how are you?'"

2.2 Role-Based Prompting
Ask the AI to act as a specific role.
✅ Example:
"Act as a product manager and draft a PRD for an AI chatbot_tutorial."
"You are a legal advisor. Explain GDPR compliance in simple terms."
2.3 Few-Shot Prompting
Provide examples to guide the AI’s response.
✅ Example:
**"Convert these sentences to active voice:
The cake was baked by Mary. → Mary baked the cake.
The report was written by John. → ?"**
2.4 Chain-of-Thought (CoT) Prompting
Encourage AI to think step by step.
✅ Example:
"Solve this math problem step by step: 152 × 47."
"Analyze the sentiment of this review step by step before giving a final answer."
2.5 Delimiter-Based Prompting
Use clear separators to avoid confusion.
✅ Example:
"Extract the key points from this text:
'''
[Paste text here]
''' "

## Prompt Response Evaluation indicators
1. Harmlessness/Safety: Assess whether the response is free from harmful, offensive, or unsafe content.
2. Writing Style: Evaluate the clarity, fluency, and tone of the response, considering factors like readability and engagement.
3. Verbosity: Rate the response based on its conciseness or excessiveness in content. Does it contain unnecessary details or is it succinct and to the point?
4. Instruction Following: Determine how well the response adheres to the instructions provided in the prompt.
5. Truthfulness: Assess the accuracy and reliability of the information presented in the response.
6. Core Content Quality: Evaluate the relevance, depth, and utility of the core content presented in the response.
7. Response Structure: Rate the organization of the response, including coherence, logical flow, and ease of understanding.

# Input format
The input format shoudl have user input and llms input,if there is no,just give a feebback.
User:Hello!
LLMs:Hello!

# Output
Your output should be well structured!
You need to give score to me.

#Start
{{Your own input or varaible}}
Judge and score for me.

Please Copy and change {{Your own input or varaible}} to your prompt which need to judge. Such as change like this

#Start
User:Hello! 
LLMs:What?
Judge and score for me.

Core-How to judge prompt response?

How

Prompt Response Evaluation indicators

Harmlessness/Safety: Assess whether the response is free from harmful, offensive, or unsafe content.
Writing Style: Evaluate the clarity, fluency, and tone of the response, considering factors like readability and engagement.
Verbosity: Rate the response based on its conciseness or excessiveness in content. Does it contain unnecessary details or is it succinct and to the point?
Instruction Following: Determine how well the response adheres to the instructions provided in the prompt.
Truthfulness: Assess the accuracy and reliability of the information presented in the response.
Core Content Quality: Evaluate the relevance, depth, and utility of the core content presented in the response.
Response Structure: Rate the organization of the response, including coherence, logical flow, and ease of understanding.

subjective evaluation

Use my automated scripts to run and then evaluate manually by your aim.

Single Conversation Automation Testing
Muti Conversation Automation Testing(To be wait)

objective evaluation

Here

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
PromptEvaluationObjective		PromptEvaluationObjective
img		img
.gitattributes		.gitattributes
LICENSE		LICENSE
promptevalshow.mp4		promptevalshow.mp4
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

PromptEval

Why and What

Tools-Prompt for Eval Prompt Response!

Core-How to judge prompt response?

How

Prompt Response Evaluation indicators

subjective evaluation

objective evaluation

About

Uh oh!

Releases

Packages

License

Formyselfonly/prompt-eval

Folders and files

Latest commit

History

Repository files navigation

PromptEval

Why and What

Tools-Prompt for Eval Prompt Response!

Core-How to judge prompt response?

How

Prompt Response Evaluation indicators

subjective evaluation

objective evaluation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Packages