AI Performance Testing
    • 03 Mar 2025
    • 3 Minutes to read
    Overview

    CommBox AI Autonomous Agent is designed to respond to customer inquiries with high accuracy, backed by measurable success metrics. This guide outlines the AI Agent's quality assurance mechanism and evaluation process.

    Note: Special Admin permissions are required.

    Evaluation System

    The Evaluation process works as follows:

    1. The Client’s Knowledge Base is uploaded and learned by the AI Agent.
    2. The Client then produces a spreadsheet with a minimum of 30 questions that cover the complete spectrum of information in that Knowledge Base, with a column stating the best expected answer.
    3. The spreadsheet is uploaded, and the AI is asked those questions. The AI's answers are compared to the expected answers provided in the spreadsheet.
    4. A report shows how well the AI answered the questions based on the guidelines and parameters we set for the evaluation.
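The four steps above can be sketched as a simple evaluation loop. Everything here is a hypothetical placeholder, not the CommBox API: `ask_ai_agent` stands in for the real AI Agent call, and the questionnaire (an Excel file in practice) is approximated as CSV.

```python
import csv
import io

# Hypothetical questionnaire; column names mirror the required spreadsheet
# format described in this article (the real template is an Excel file).
QUESTIONNAIRE_CSV = """Row ID,Test Question,Expected Answer
1,What are the support hours?,Support is available 9:00-17:00 on weekdays.
2,How do I reset my password?,Use the Forgot Password link on the login page.
"""

def ask_ai_agent(question: str) -> str:
    """Placeholder for the AI Agent call; answers here are canned for illustration."""
    canned = {
        "What are the support hours?": "Support is available 9:00-17:00 on weekdays.",
        "How do I reset my password?": "",  # simulates an "unknown" (no answer)
    }
    return canned.get(question, "")

def run_evaluation(csv_text: str) -> list:
    """Ask the agent each test question and record whether it answered
    and whether the answer matched the expected one."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        answer = ask_ai_agent(row["Test Question"])
        results.append({
            "row_id": row["Row ID"],
            "answered": bool(answer.strip()),
            "matches_expected": answer.strip() == row["Expected Answer"].strip(),
        })
    return results

report = run_evaluation(QUESTIONNAIRE_CSV)
```

In the real process the comparison is done by the evaluation system against the guidelines described below, not by exact string matching.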

    We evaluate each response using four key criteria, weighted by their impact on overall performance:

    1. Information Accuracy - Verifies response content against the client’s knowledge base. Responses must be 100% verified to pass our quality check.
    2. Answer Coverage - Measures how thoroughly the response addresses all key aspects of the inquiry compared to the complete answer available in the Knowledge base.
    3. Response Relevance - Ensures responses are focused and directly address the specific inquiry.
    4. Communication Quality - Evaluates each response's clarity, structure, grammar, spelling, and tone.

    We also evaluate "unknown" instances, in which the AI did not answer the question at all. We verify that repeating the same evaluation process produces better results as the AI is exposed to the same or similar inquiries.
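A weighted combination of the four criteria can be sketched as below. The weight values are purely illustrative placeholders; the article does not publish the actual weights CommBox uses.

```python
# Illustrative weights only; the real weighting is not published in this article.
WEIGHTS = {
    "information_accuracy": 0.40,
    "answer_coverage": 0.25,
    "response_relevance": 0.20,
    "communication_quality": 0.15,
}

def weighted_score(criteria: dict) -> float:
    """Combine per-criterion scores (0-100) into one weighted score.
    Per the quality check, information that is not 100% verified fails outright."""
    if criteria["information_accuracy"] < 100:
        return 0.0  # inaccurate information is an automatic failure
    return sum(WEIGHTS[key] * criteria[key] for key in WEIGHTS)
```

A response scoring 100 on every criterion yields a perfect weighted score, while any accuracy shortfall zeroes the result regardless of the other criteria.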

    Test Performance

    Step I: Create an AI KB Item

    1. At the Automation Hub, select the Knowledge Base tab, click the + Add Item button at the top, and select AI PDF from the menu.
      (Screenshot: creating a new AI KB PDF item)

    2. Enter a unique name for the AI item, drag and drop relevant documents, and click the Learn button.

    Step II: Create and Upload a Questionnaire

    1. Click here to download the Excel table format for your questionnaire.
      Required columns:
      a. Row ID
      b. Test Question – a simulated question a customer may ask.
      c. Expected Answer – the correct answer you expect the customer to receive, based on the information in the KB item.
      d. Explanation – the relevant sections of the Knowledge Base on which the expected answer is based; mention key points that must be included in the full answer.

    2. At the Automation Hub, select the Knowledge Base tab, followed by the Test & Improve tab.

    3. At the top corner, click the New Test button.

    4. In the dialog box, upload the questionnaire file and click Start Test.
      (Screenshot: uploading a test questionnaire for AI)

    Note:
    Depending on the file size, the test results may take about an hour to produce. Click the Email Notification icon and add your email address to receive an email when the results are ready.
    (Screenshot: KB Testing)

    Click here for a sample questionnaire.
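Before uploading, it can help to check that the questionnaire meets the requirements above (the four columns and the 30-question minimum). This is a local pre-check sketch, not part of the CommBox upload flow, and it approximates the Excel template as CSV.

```python
import csv
import io

# Column names and the 30-question minimum come from the article;
# the CSV format is an approximation of the Excel template.
REQUIRED_COLUMNS = ["Row ID", "Test Question", "Expected Answer", "Explanation"]
MIN_QUESTIONS = 30

def validate_questionnaire(csv_text: str) -> list:
    """Return a list of problems found in the questionnaire; empty means OK."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    rows = list(reader)
    if len(rows) < MIN_QUESTIONS:
        problems.append(f"only {len(rows)} questions; at least {MIN_QUESTIONS} required")
    return problems
```

Running this on a file with too few rows or a missing column surfaces the problem before the test is started.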

    Step III: Test Results

    Once the test is complete, the test’s detailed information will be listed, including:

    • The Test’s name
    • The admin who requested the test
    • Starting date and time of the test
    • Status of test performance
    • The number of questions answered out of the total asked
    • The final score for the questions answered. Any score below 85% appears in red.

    Click the Download document icon at the end of the summary information for the full report.

    About the report:

    The generated table includes the uploaded columns plus the Agent Answer for each question, followed by the four evaluation criteria described above.
    The last two columns, Score and Question Answered, show whether each answer is acceptable (inaccurate information or no answer is an automatic failure).
    The summary section displays the accuracy rate of the answered questions, followed by the overall statistical result, which accounts for fully answered, partly answered, and unanswered questions.

    Click here for sample test results.
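The two summary figures described above can be illustrated as follows. The per-question values are made up for illustration; the real figures come from the downloaded report.

```python
# Each entry: (answered, score), where score is 0-100.
# Values are invented for illustration only.
results = [
    (True, 100.0),   # fully answered
    (True, 70.0),    # partly answered
    (False, 0.0),    # not answered (automatic failure)
]

answered_scores = [score for ok, score in results if ok]
# Accuracy rate over the answered questions only.
accuracy_rate = sum(answered_scores) / len(answered_scores)
# Overall statistic over all questions, counting unanswered as 0.
overall = sum(score for _, score in results) / len(results)
# Scores below the 85% success rate are highlighted in red in the summary.
flagged = overall < 85.0
```

Here the accuracy rate over answered questions (85.0) looks acceptable, but the overall statistic including the unanswered question falls below the 85% threshold and would be flagged.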

    Improve your knowledge base:

    To improve test results, you may take the following actions:

    • Verify that the answers to your questions appear in the uploaded Knowledge Base item. Unanswered questions may indicate a lack of correlation between the questions and the information provided.
    • Divide information into smaller subsections using numbered lists (e.g., 2a, 2b, etc.)
