add analysis for PingCAP

2024-08-16 12:19:21 +08:00 · 2024-08-16 12:19:21 +08:00 · cbbd865b88
parent 0074e954dd
commit cbbd865b88
21 changed files with 1949 additions and 1 deletions
--- a/.DS_Store
+++ b/.DS_Store
--- a/9c1fd565-89c1-4503-a626-75218a9fe477.jpg
+++ b/9c1fd565-89c1-4503-a626-75218a9fe477.jpg
--- a/API.txt
+++ b/API.txt
@ -0,0 +1,713 @@
 [Image]
 获取API KEY地址：https://chatglm.cn/developersPanel/apiSet
 github入门代码：https://github.com/MetaGLM/glm-cookbook/blob/main/glms/api/glms_api.ipynb
 简介
 具备开发能力的创作者可以通过API调用的方式使用智能体，方便将智能体应用于更多业务场景中。API基本覆盖清言C端页面所有功能，包含文本对话，文生图，图片解读，联网搜索，文档解析，Python代码执行，外部API调用等。接入前请通过创作者中心获取api_key及api_secret. 当前仅支持通过API方式调用创作者自己创建的智能体。
 基础信息
 服务根路径： https://chatglm.cn/chatglm/assistant-api/v1/
 鉴权方式：header中发送 Authorization:  Bearer {access_token}. Token获取方式参考下方Access Token获取接口.
 请求体格式：JSON
 返回格式（除stream接口外）：JSON 
 返回结构体：
 字段
 类型
 必填
 说明
 status
 int
 Y
 是否成功，0->成功，其他代表失败，默认错误返回1. 具体错误码见接口说明。
 message
 string
 N
 错误信息
 result
 Dict
 N
 返回数据
 用量控制
 为避免API调用占用过多资源，系统限制每个开发者（非API_KEY）的并发数量。默认并发数为2。并发限制范围包括上传文件和会话调用API接口。
 此外，我们对每个开发者调用会话接口的次数也设置了上限，目前为一天500次，总量5000次。
 接口列表
 1. Access Token获取
 请求路径和方式：POST /get_token
 输入：
 字段
 类型
 必填
 说明
 api_key
 string
 Y
 开发者ID
 api_secret
 string
 Y
 开发者秘钥
 输出数据：
 字段
 类型
 必填
 说明
 access_token
 string
 Y
 接口鉴权token
 expires_in
 int
 Y
 token过期时间,单位秒。一次授权有效期为10天。请注意在token过期前更新。
 异常说明：
 异常情况
 http码
 status码
 api_key被禁用
 403
 1001
 api_key或api_secret验证失败
 401
 1002
 2. Assistant 会话调用
 请求路径和方式：POST /stream  
 输入：
 字段
 类型
 必填
 说明
 assistant_id
 string
 Y
 智能体 ID，浏览器打开智能体对话页后，可通过地址栏查看。下图红框中即为ID。
 [Image]
 conversation_id
 string
 N
 会话 ID，不传默认创建新会话。需要继续之前的会话时，传入流式输出Result中的conversation_id值。对话轮数和上下文长度无限制，需要注意上下文过大会导致响应耗时增加。
 prompt
 string
 Y
 prompt文本
 file_list
 List<Dict>
 N
 文件列表，传入文件上传接口返回的file_id。 [{"file_id":"xxx"}]
 meta_data
 Dict
 N
 元信息，拓展字段 
 输出：
 SSE 流式输出
 字段
 类型
 必填
 说明
 message
 Result
 Y
 SSE返回json格式消息体.
 输出Result结构
 字段
 类型
 必填
 说明
 history_id
 string
 Y
 历史记录ID
 conversation_id
 string
 Y
 对话上下文ID
 assistant_id
 string
 Y
 智能体ID
 message
 Message
 Y
 输出消息体, 每次输出一个message对象，前一个message结束后，输出下一个message
 created_at
 string
 Y
 创建时间，格式：“2023-11-23 18:00:00”
 status
 string 
 Y
 init->初始态, processing->生成中 finish->生成结束,error->生成异常。
 其中 finish和error为最终态，init和processing为中间态。网络正常的情况下，Result状态会从init到processing，再到finish或error。
 出现输入风控拦截时，状态从init直接进入error。
 last_error
 dict
 N
 异常信息 {"error_code":"", ""error_msg":""} 
 具体报错信息见下方【流式异常说明】
 meta_data
 dict
 N
 {}
 输出Message结构
 字段
 类型
 必填
 说明
 role
 string
 Y
 assistant为模型输出，tool为工具输出
 content
 Content
 Y
 输出内容，参考下方Content结构。
 status
 string
 Y
 init/processing/finish/error，状态流转逻辑与外层Result相同。
 created_at
 string
 Y
 时间戳。格式参考：“2023-11-23 18:00:00”
 meta_data
 dict
 N
 {}
 输出Content结构
 根据不同的Tool调用情况，返回的content结构不同，参考如下：
 Tool使用情况
 返回结构
 无工具，文本输出
 {
    "type": "text",
    "text": "你好"
 }
 CogView画图
 {
    "type":"image", 
    "image":[
        {"image_url":"https://sfile.chatglm.cn/..."}
    ]
 }
 Python代码执行
 代码执行-开始
 {
  "type": "code",
  "code": "# Calculating the square of 10\n10 ** 2"
 }
 代码执行-成功，结果返回
 {
  "type": "execution_output",
  "content": "100"
 }
 代码执行-失败
 {
  "type": "system_error",
  "content": "sandbox error code: 504"
 }
 OPEN-URL链接打开
 OPEN-URL开始
 {
    "type":"tool_calls",
    "tool_calls":{
        "name":"browser",
        "arguments":"open_url(\"https://www.chatglm.com/\")"}
    }
 }
 OPEN-URL结束-成功
 {
    "type": "browser_result",
    "content": "{\"title\": \" Split string with multiple delimiters in Python - Stack Overflow\", \"url\": \" https://stackoverflow.com/questions/4998629/split-string-with-multiple-delimiters-in-python\"}",
 }
 OPEN-URL结束-失败
 {
    "type":"system_error",
    "content":"Error when executing command "}],
    "meta_data":{
        "failedCommand":"open_url(\"https://www.chatglm.com/\")"
        }
    }
 }
 Search联网搜索
 Search 开始
 {
    "type":"tool_calls",
    "tool_calls":{
        "name":"browser",
        "arguments":"search(\"河南河北山西霰雪预警 2024\", recency_days=30)"}
    }
 }
 Search 结束-成功
 {
    "type": "browser_result",
    "content": "{}",
    "meta_data":{
      {
        "metadata_list":[
          {
            "type":"webpage",
            "title":"2024年1-2月份山西省进出口情况分析 - 昆明海关",
            "url":"http://kunming.customs.gov.cn/taiyuan_customs/zfxxgk50/3023433/3023453/4676174/5755479/index.html","text":"比去年同期（下同）增长26.8%。全省活跃企业数明显增加前2个月。占全省进出口总值的40.3%。国家",
            "pub_date":""
          },
        ]
      }
    }
 }
 Search 结束-失败
 {
    "type":"system_error",
    "content":"Error when executing command "}],
    "meta_data":{
        "failedCommand":"search(\"河南河北山西霰雪预警 2024\", recency_days=30)"
        }
    }
 }
 mclick 搜索结果页面打开 开始. mclick中参数为打开链接序号列表，关联搜索结果
 {
  "type": "tool_calls",
  "tool_calls": {
    "name": "browser",
    "arguments": "mclick([0, 1, 3])"
  },
  "meta_data":{
      "metadata_list":[
          {
            "type":"webpage",
            "title":"2024年1-2月份山西省进出口情况分析 - 昆明海关",
            "url":"http://kunming.customs.gov.cn/taiyuan_customs/zfxxgk50/3023433/3023453/4676174/5755479/index.html","text":"比去年同期（下同）增长26.8%。全省活跃企业数明显增加前2个月。占全省进出口总值的40.3%。国家",
            "pub_date":""
          },
      ]
  }
 }
 mclick 搜索结果页面打开 成功. 所有链接均完成后，返回quote_result.
 {
    "type": "quote_result",
    "content": "{}",
    "meta_data":{
        "metadata_list":[
          {
            "type":"webpage",
            "title":"2024年1-2月份山西省进出口情况分析 - 昆明海关",
            "url":"http://kunming.customs.gov.cn/taiyuan_customs/zfxxgk50/3023433/3023453/4676174/5755479/index.html","text":"比去年同期（下同）增长26.8%。全省活跃企业数明显增加前2个月。占全省进出口总值的40.3%。国家",
            "pub_date":""
          },
        ]
    }
 }
 mclick 搜索结果页面打开 失败. 每个URL失败返回一次该事件，可能会出现部分失败.
 {
  "type": "system_error",
  "content": ""，
  "meta_data":{
      "failedURL": "https://xxx.xx"
  }
 }
 外部API调用
 请求API的参数内容
 {
    "type": "tool_calls",
    "tool_calls": {
        "name": "generate",
        "arguments": "```python\ntool_call(Content-Type='application/json', title='Hello World', type='post', platform='wordpress')\n```",
        "host": "ppt-test-v101.wxbjq.top"
    }
 }
 请求API的返回内容
 {
    "type": "function_result",
    "content": "{\"status\":0,\"message\":\"success\",\"result\":{\"count\":818},\"rid\":\"2030521d-b4ea-4c38-854f-f3f189a3ebb3\"}"
 }
 知识库检索（需在智能体配置中打开“显示相关的知识库段落”）
 检索开始
 {
    "type": "rag_slices",
    "content": []
 }
 检索结果
 {
    "type": "rag_slices",
    "content": [
        {"text": "知识库内容abcd", "document_name": "1.pdf"}
    ]
 }
 异常说明：
 异常情况
 http码
 status码
 api_key被禁用
 403
 10004
 api_key已删除
 400
 10003
 FileList中文件不存在
 400
 10005
 当日会话调用次数超限
 403
 10008
 实时并发数超限
 403
 10007
 智能体被删除
 403
 10010
 无权限调用智能体
 403
 10018
 流式异常说明：
 异常情况
 error_code码
 安全风控拦截
 10031
 生成中断（网络原因）
 10024
 生成错误（模型原因）
 10027
 生成异常（系统原因）
 10025
 生成图片失败
 10028
 3. Assistant 会话调用（非流式输出）
 请求路径和方式：POST /stream_sync  
 输入：
 字段
 类型
 必填
 说明
 assistant_id
 string
 Y
 智能体 ID，浏览器打开智能体对话页后，可通过地址栏查看。下图红框中即为ID。
 [Image]
 conversation_id
 string
 N
 会话 ID，不传默认创建新会话。需要继续之前的会话时，传入流式输出Result中的conversation_id值。对话轮数和上下文长度无限制，需要注意上下文过大会导致响应耗时增加。
 prompt
 string
 Y
 prompt文本
 file_list
 List<Dict>
 N
 文件列表，传入文件上传接口返回的file_id。 [{"file_id":"xxx"}]
 meta_data
 Dict
 N
 元信息，拓展字段 
 输出：
 HTTP response JSON输出
 https://zhipu-ai.feishu.cn/sync/Bz6sdk1QbsIbnjbX3nBcLOSMn8b
 输出Result结构
 字段
 类型
 必填
 说明
 history_id
 string
 Y
 历史记录ID
 conversation_id
 string
 Y
 对话上下文ID
 output
 List<Part>
 Y
 输出消息体列表，每个Part对应流式中的一个Message
 status
 string 
 Y
 init->初始态, processing->生成中 finish->生成结束,error->生成异常。
 其中 finish和error为最终态，init和processing为中间态。网络正常的情况下，Result状态会从init到processing，再到finish或error。
 出现输入风控拦截时，状态从init直接进入error。
 输出Part结构
 字段
 类型
 必填
 说明
 role
 string
 Y
 assistant为模型输出，tool为工具输出
 content
 Content
 Y
 输出内容，参考下方Content结构。
 status
 string
 Y
 init/processing/finish/error，状态流转逻辑与外层Result相同。
 created_at
 string
 Y
 时间戳。格式参考：“2023-11-23 18:00:00”
 meta_data
 dict
 N
 {}
 输出Content结构
 根据不同的Tool调用情况，返回的content结构不同，参考如下：
 Tool使用情况
 返回结构
 无工具，文本输出
 {
    "type": "text",
    "text": "你好"
 }
 CogView画图
 {
    "type":"image", 
    "image":[
        {"image_url":"https://sfile.chatglm.cn/..."}
    ]
 }
 Python代码执行
 代码执行-开始
 {
  "type": "code",
  "code": "# Calculating the square of 10\n10 ** 2"
 }
 代码执行-成功，结果返回
 {
  "type": "execution_output",
  "content": "100"
 }
 代码执行-失败
 {
  "type": "system_error",
  "content": "sandbox error code: 504"
 }
 OPEN-URL链接打开
 OPEN-URL开始
 {
    "type":"tool_calls",
    "tool_calls":{
        "name":"browser",
        "arguments":"open_url(\"https://www.chatglm.com/\")"}
    }
 }
 OPEN-URL结束-成功
 {
    "type": "browser_result",
    "content": "{\"title\": \" Split string with multiple delimiters in Python - Stack Overflow\", \"url\": \" https://stackoverflow.com/questions/4998629/split-string-with-multiple-delimiters-in-python\"}",
 }
 OPEN-URL结束-失败
 {
    "type":"system_error",
    "content":"Error when executing command "}],
    "meta_data":{
        "failedCommand":"open_url(\"https://www.chatglm.com/\")"
        }
    }
 }
 Search联网搜索
 Search 开始
 {
    "type":"tool_calls",
    "tool_calls":{
        "name":"search",
        "arguments":"search(\"河南河北山西霰雪预警 2024\", recency_days=30)"}
    }
 }
 Search 结束-成功
 {
    "type": "browser_result",
    "content": "{}",
    "meta_data":{
      {
        "metadata_list":[
          {
            "type":"webpage",
            "title":"2024年1-2月份山西省进出口情况分析 - 昆明海关",
            "url":"http://kunming.customs.gov.cn/taiyuan_customs/zfxxgk50/3023433/3023453/4676174/5755479/index.html","text":"比去年同期（下同）增长26.8%。全省活跃企业数明显增加前2个月。占全省进出口总值的40.3%。国家",
            "pub_date":""
          },
        ]
      }
    }
 }
 Search 结束-失败
 {
    "type":"system_error",
    "content":"Error when executing command "}],
    "meta_data":{
        "failedCommand":"search(\"河南河北山西霰雪预警 2024\", recency_days=30)"
        }
    }
 }
 mclick 搜索结果页面打开 开始. mclick中参数为打开链接序号列表，关联搜索结果
 {
  "type": "tool_calls",
  "tool_calls": {
    "name": "browser",
    "arguments": "mclick([0, 1, 3])"
  },
  "meta_data":{
      "metadata_list":[
          {
            "type":"webpage",
            "title":"2024年1-2月份山西省进出口情况分析 - 昆明海关",
            "url":"http://kunming.customs.gov.cn/taiyuan_customs/zfxxgk50/3023433/3023453/4676174/5755479/index.html","text":"比去年同期（下同）增长26.8%。全省活跃企业数明显增加前2个月。占全省进出口总值的40.3%。国家",
            "pub_date":""
          },
      ]
  }
 }
 mclick 搜索结果页面打开 成功. 所有链接均完成后，返回quote_result.
 {
    "type": "quote_result",
    "content": "{}",
    "meta_data":{
        "metadata_list":[
          {
            "type":"webpage",
            "title":"2024年1-2月份山西省进出口情况分析 - 昆明海关",
            "url":"http://kunming.customs.gov.cn/taiyuan_customs/zfxxgk50/3023433/3023453/4676174/5755479/index.html","text":"比去年同期（下同）增长26.8%。全省活跃企业数明显增加前2个月。占全省进出口总值的40.3%。国家",
            "pub_date":""
          },
        ]
    }
 }
 mclick 搜索结果页面打开 失败. 每个URL失败返回一次该事件，可能会出现部分失败.
 {
  "type": "system_error",
  "content": ""，
  "meta_data":{
      "failedURL": "https://xxx.xx"
  }
 }
 外部API调用
 请求API的参数内容
 {
    "type": "tool_calls",
    "tool_calls": {
        "name": "generate",
        "arguments": "```python\ntool_call(Content-Type='application/json', title='Hello World', type='post', platform='wordpress')\n```",
        "host": "ppt-test-v101.wxbjq.top"
    }
 }
 请求API的返回内容
 {
    "type": "function_result",
    "content": "{\"status\":0,\"message\":\"success\",\"result\":{\"count\":818},\"rid\":\"2030521d-b4ea-4c38-854f-f3f189a3ebb3\"}"
 }
 知识库检索（需在智能体配置中打开“显示相关的知识库段落”）
 检索开始
 {
    "type": "rag_slices",
    "content": []
 }
 检索结果
 {
    "type": "rag_slices",
    "content": [
        {"text": "知识库内容abcd", "document_name": "1.pdf"}
    ]
 }
 异常说明：
 异常情况
 http码
 status码
 api_key被禁用
 403
 10004
 api_key已删除
 400
 10003
 FileList中文件不存在
 400
 10005
 当日会话调用次数超限
 403
 10008
 实时并发数超限
 403
 10007
 智能体被删除
 403
 10010
 无权限调用智能体
 403
 10018
 4. Assistant 对话文件上传
 上传文件，输入为文件流，输出file_id用于会话调用时使用。文件大小限制为100MB。目前支持的业务场景与文件类型对应参考下表：
 业务场景
 文件类型
 图片解析
 'jpg', 'png', 'jpeg', 'webp', 'gif'
 文档解读
 'pdf', 'docx', 'doc', 'txt', 'pptx', 'md', 'epub', 'epub.zip'
 代码执行（Code Interpreter）
 上传其他所有类型文件，或在图片与文档文件混合使用时。
 请求路径和方式：POST form-data /file_upload 
 输入：
 字段
 类型
 必填
 说明
 file
 file
 Y
 文件流。仅支持上传单一文件。
 输出：
 字段
 类型
 必填
 说明
 file_id
 string
 Y
 文件ID
 file_name
 String 
 Y
 文件名
 异常说明：
 异常情况
 http码
 status码
 api_key被禁用
 403
 10004
 api_key已删除
 400
 10003
 实时并发数超限
 403
 10006
 文件上传失败
 500
 11003
 文档解析失败
 500
 11004
 文档字数超限（上限12万Token）
 400
 11005
 文件大小超限
 400
 11006
 文件处理失败
 500
 11009
--- a/AgentProxy.py
+++ b/AgentProxy.py
@ -0,0 +1,94 @@
 import json
 import requests
 from ExcelHelper import ExcelHelper
 class AgentProxy:
    def __init__(self, assistant_id, api_key, api_secret):
        self.assistant_id = assistant_id
        self.api_key = api_key
        self.api_secret = api_secret
        self.access_token = self.get_access_token(api_key, api_secret)
    def get_access_token(self, api_key, api_secret):
        url = "https://chatglm.cn/chatglm/assistant-api/v1/get_token"
        data = {
            "api_key": api_key,
            "api_secret": api_secret
        }
        response = requests.post(url, json=data)
        token_info = response.json()
        return token_info['result']['access_token']
    def handle_response(self, data_dict):
        message = data_dict.get("message")
        if len(message) > 0:
            content = message.get("content")
            if len(content) > 0:
                response_type = content.get("type")
                if response_type == "text":
                    text = content.get("text", "No text provided")
                    return f"{text}"
                elif response_type == "image":
                    images = content.get("image", [])
                    image_urls = ", ".join(image.get("image_url") for image in images)
                    return f"{image_urls}"
                elif response_type == "code":
                    return f"{content.get('code')}"
                elif response_type == "execution_output":
                    return f"{content.get('content')}"
                elif response_type == "system_error":
                    return f"{content.get('content')}"
                elif response_type == "tool_calls":
                    return f"{data_dict['tool_calls']}"
                elif response_type == "browser_result":
                    content = json.loads(content.get("content", "{}"))
                    return f"Browser Result - Title: {content.get('title')} URL: {content.get('url')}"
    def send_message(self,  prompt, conversation_id=None, file_list=None, meta_data=None):
        url = "https://chatglm.cn/chatglm/assistant-api/v1/stream"
        headers = {
            "Authorization": f"Bearer {self.access_token}",
            "Content-Type": "application/json"
        }
        data = {
            "assistant_id": self.assistant_id,
            "prompt": prompt,
        }
        if conversation_id:
            data["conversation_id"] = conversation_id
        if file_list:
            data["file_list"] = file_list
        if meta_data:
            data["meta_data"] = meta_data
        with requests.post(url, json=data, headers=headers) as response:
            if response.status_code == 200:
                for line in response.iter_lines():
                    if line:
                        decoded_line = line.decode('utf-8')
                        if decoded_line.startswith('data:'):
                            data_dict = json.loads(decoded_line[5:])
                            output = self.handle_response(data_dict)
                            print(output)
            else:
                return "Request failed", response.status_code
        return output
 # # Here you need to replace the API Key and API Secret with your，I provide a test key and secret here
 # api_key = '25bda2c39c0f8ca0' 
 # api_secret = 'e0008b9b9727cb8ceea5a132dbe62495' 
 # assistant_id = "66bb09a84673b57506fe7bbd" 
 # agent = AgentProxy(assistant_id, api_key, api_secret)
 # agent.send_message("你好")
--- a/ExcelHelper.py
+++ b/ExcelHelper.py
@ -29,6 +29,38 @@ class ExcelHelper:
        self.workbook.save(filename)
        print(f"Excel 文件已保存为 {filename}")
    def extract_columns(self, filename, columns_to_extract, new_filename):
        """
        Extracts specified columns from an existing Excel file and saves them to a new file.
        :param filename: The source Excel file name.
        :param columns_to_extract: A list of column indices (1-based) to extract.
        :param new_filename: The name of the new Excel file to save the extracted columns.
        """
        # Load the source workbook
        source_wb = openpyxl.load_workbook(filename)
        source_sheet = source_wb.active
        # Create a new workbook and sheet
        new_wb = openpyxl.Workbook()
        new_sheet = new_wb.active
        # Copy headers
        for col_idx in columns_to_extract:
            new_sheet.cell(row=1, column=columns_to_extract.index(col_idx) + 1, value=source_sheet.cell(row=1, column=col_idx).value)
        # Copy data
        for row_idx, row in enumerate(source_sheet.iter_rows(min_row=2), start=2):
            for col_idx in columns_to_extract:
                new_sheet.cell(row=row_idx, column=columns_to_extract.index(col_idx) + 1, value=source_sheet.cell(row=row_idx, column=col_idx).value)
        # Save the new workbook
        new_wb.save(new_filename)
        print(f"Extracted columns saved to {new_filename}")
 # # 示例数据
 # data = [
 #     [
--- a/WechatIMG8412.png
+++ b/WechatIMG8412.png
--- a/WechatIMG8418.png
+++ b/WechatIMG8418.png
--- a/analysis.py
+++ b/analysis.py
@ -0,0 +1,15 @@
 import pandas as pd
 # 读取Excel文件
 df = pd.read_excel('pingcap_pipeline.xlsx')
 # 按照"客户分类"列分组，并计算ACV列的和
 acv_name = '预估 ACV'
 grouped_df = df.groupby('负责人所属行业')[acv_name].sum().astype(int).reset_index()
 grouped_df = grouped_df.sort_values(by=acv_name, ascending=False)
 grouped_df[acv_name] = grouped_df[acv_name].apply(lambda x: '{:,}'.format(x))
 # 打印结果
 print(grouped_df)
--- a/analysis_pipeline.py
+++ b/analysis_pipeline.py
@ -0,0 +1,62 @@
 import pandas as pd
 from AgentProxy import AgentProxy
 # Here you need to replace the API Key and API Secret with your，I provide a test key and secret here
 api_key = '25bda2c39c0f8ca0' 
 api_secret = 'e0008b9b9727cb8ceea5a132dbe62495' 
 assistant_id = "66bb09a84673b57506fe7bbd" 
 agent = AgentProxy(assistant_id, api_key, api_secret)
 # Prospecting
 # Evaluation
 # Qualification
 # Bidding / Negotiating
 # Contract Review
 # Closed Won
 # Cancel
 # Closed Lost
 sales_stages = ["Prospecting", "Evaluation", "Qualification", "Bidding / Negotiating", "Contract Review", "Closed Won", "Cancel", "Closed Lost"]
 prompt = "某公司销售阶段分为如下几个定义，你能告诉我什么信息 "+ str(sales_stages)
 print(prompt)
 # print(agent.send_message(prompt))
 sales_stage_definition = '''
 1. **Prospecting（潜在客户开发）**：这一阶段涉及识别和开发潜在客户。销售人员通过各种渠道（如电话、电子邮件、社交媒体等）寻找潜在买家。
 2. **Evaluation（评估）**：在评估阶段，销售团队会评估潜在客户的需求，确定他们是否与公司的产品或服务相匹配。同时，潜在客户也在评估不同的供应商。
 3. **Qualification（资格认定）**：这一阶段的目标是确定潜在客户是否具有成为合格销售机会的潜力。这通常涉及对客户的预算、需求、决策过程和时间线等进行评估。
 4. **Bidding / Negotiating（投标/谈判）**：在这个阶段，销售人员会向客户提交正式的报价或提案，并进行必要的谈判，以达成最终的销售协议。
 5. **Contract Review（合同审查）**：一旦谈判完成，双方将审查合同条款，确保所有细节都得到妥善处理，并准备好签署。
 6. **Closed Won（成功关闭）**：这是销售流程的最终目标，表示交易已经成功完成，客户已经购买了产品或服务。
 7. **Cancel（取消）**：在某些情况下，交易可能会在过程的任何阶段取消。这可能是因为客户改变了主意，或者发现产品或服务不再符合他们的需求。
 8. **Closed Lost（失败关闭）**：如果销售机会没有成功，它将被标记为“失败关闭”。这可能是因为竞争、价格问题或客户需求的改变等原因。'''
 # Read the Excel file
 df = pd.read_excel('output_top20.xlsx')
 # Iterate over each row in the DataFrame
 for index, row in df.iterrows():
    # Extract the information from the column "当前详细状态及Close节奏"
    try:
        detailed_status = row['当前详细状态及Close节奏']
        print(f"Processing row {index}")
        detailed_current_stage = row['Sales stage']
        prompt = f"某公司当前销售定义为 {sales_stage_definition}, 当前销售阶段为 {detailed_current_stage}, 销售人员填写的销售动作日志为: {detailed_status} , 请分析当前销售阶段以及销售动作日志，判断其销售动作是否支持将销售阶段转化到当前阶段{detailed_current_stage}，给出判断结果及销售阶段分析报告"
        analysis_result = agent.send_message(prompt)
        print(analysis_result)
        df.at[index, '分析结果'] = analysis_result  # Directly update the DataFrame
    except Exception as e:
        print(f"Error processing row {index}: {e}")
        df.at[index, '分析结果'] = f"Error: {e}"  # Log the error in the DataFrame
 df.to_excel('analysis_result.xlsx', index=False)
--- a/analysis_result.txt
+++ b/analysis_result.txt
@ -0,0 +1,90 @@
 1. sub-industry 
 银行: 175,888,600
 保险: 39,237,140
 证券基金: 24,932,850
 运营商: 19,638,040
 互联网+: 16,770,820
 制造/汽车: 15,018,440
 零售: 14,976,950
 公共事业: 12,798,200
 物流/交通: 10,177,670
 互金: 8,386,816
 能源电力: 3,905,335
 其他: 3,853,000
 媒体/文娱: 1,019,133
 教育/科研: 792,000
 ISV: 46,400
 2. industry
 金融: 241,891,400.00
 新经济: 62,392,410.00
 其他: 31,256,490.00
 其他: 公共事业: 1,012,747.00
 其他: 政府: 683,402.30
 其他: 医疗: 395,475.20
 其他: 运营商: 351,900.00
 其他: 公共事业部: 283,200.00
 pipeline:
 36          银行  435,192,472
 34        证券基金   96,413,873
 5         公共事业   76,163,399
 4           保险   73,039,728
 33        能源电力   65,098,557
 28       物流/交通   54,499,388
 24       制造/汽车   45,458,488
 35         运营商   41,493,666
 37          零售   35,260,799
 2         互联网+   33,295,017
 6           其他   25,413,146
 3           互金   24,768,317
 1          ISV    8,490,800
 16     其他: 国央企    4,866,700
 25       媒体/文娱    3,066,800
 31          石油    2,540,000
 15      其他: 国企    2,402,000
 21    其他: 资产管理    1,000,000
 17      其他: 政府      990,000
 18      其他: 服务      980,000
 14      其他: 医疗      700,000
 7     其他:  云服务      610,000
 9   其他:  央企招商局      500,000
 12      其他: 农业      450,000
 22    其他: 金融租赁      300,000
 10     其他:  教育      300,000
 20    其他: 融资担保      200,000
 23     其他: 高科技      153,891
 8    其他:  公共卫生      150,000
 13    其他: 农林牧渔      100,000
 27       教育/科研            0
 29    物流/交通/出行            0
 30  电信/网络/云服务商            0
 26    媒体/视频/文娱            0
 32  社交/门户/在线服务            0
 19  其他: 消费金融公司            0
 11     其他:  游戏            0
 38   零售/电商/消费品            0
  负责人所属行业       预估 ACV
 0     中小行  363,526,931
 3     新经济  229,650,249
 1      大行  179,490,830
 6      证券   98,536,868
 2      政府   52,675,168
 5      能源   50,749,255
 7     运营商   34,007,166
 4      渠道    8,706,121
 销售分布情况：
 高价值单子（ACV500万以上）：虽然金额占比19%，但单子个数仅占1.1%，这表明大部分销售额来自于少数几个大客户。这种集中度可能带来风险，因为如果这些大客户流失，将对总体销售额产生重大影响。
 中等价值单子（ACV500-100万）：金额和单子个数占比分别为47.2%和16%，这部分是销售的主力军，显示了较好的分散性。
 低价值单子（ACV100-50万和50万以下）：虽然单子个数占比较大（分别为15.3%和67.6%），但金额占比相对较小（分别为15.4%和18.4%）。这表明许多销售活动集中在低价值单子上，可能影响整体销售额的提升。
 销售任务完成可能性：
 每位销售人员的任务ACV为500-600万。根据目前的数据分布，如果高价值单子能够保持稳定，同时中等价值单子数量和质量有所提升，完成任务的可能性是存在的。
 但是，如果过度依赖少数几个大客户，或者低价值单子占比过高，可能会影响整体销售任务的完成。
 潜在问题：
 客户集中度：高价值单子的客户可能过于集中，需要分散风险。
 销售策略：可能需要调整销售策略，增加中等价值单子的数量和质量，以提高销售额。
 资源分配：销售人员可能需要更有效地分配资源，更多地关注潜在的高价值客户，同时维持和提升中等价值客户的关系。
 综上所述，虽然存在一些问题和挑战，但通过适当的策略调整和资源优化，完成销售任务是有希望的。建议重点关注客户多样性和销售策略的优化。
--- a/analysis_result.xlsx
+++ b/analysis_result.xlsx
--- a/assistant.py
+++ b/assistant.py
@ -0,0 +1,52 @@
 from authentication import get_access_token
 import requests
 def upload_file(file_path, token, assistant_id):
    url = f"{base_url}/file_upload"
    # Check if the file exists
    try:
        file = open(file_path, 'rb')
    except FileNotFoundError:
        return {"error": "File not found", "file_path": file_path}
    try:
        # Dynamically get the file name from the file path
        file_name = file_path.split('/')[-1]
        files = {'file': (file_name, file)}
        headers = {
            "Authorization": f"Bearer {token}"
        }
        data = {
            "assistant_id": assistant_id
        }
        print("Request Headers:", headers)
        print("Request Data:", data)
        print("File Path:", file_path)
        response = requests.post(url, files=files, headers=headers)
        print("Response Status Code:", response.status_code)
        print("Response JSON:", response.json())
        if response.status_code == 200:
            return response.json()
        else:
            return {"error": "File upload failed", "status_code": response.status_code, "response": response.json()}
    except requests.RequestException as e:
        return {"error": "Request failed", "details": str(e)}
    finally:
        file.close()
 api_key = '25bda2c39c0f8ca0' 
 api_secret = 'e0008b9b9727cb8ceea5a132dbe62495' 
 token = get_access_token(api_key, api_secret)
 print(token)
 assistant_id = "66b46e7d1c146ed6a5220d7a" 
 base_url = "https://chatglm.cn/chatglm/assistant-api/v1"
 upload_file_path = "/Users/tigeren/Dev/digisky/market_assistant/pingcap.xlsx"
 upload_file_response = upload_file(upload_file_path, token, assistant_id)
 print(upload_file_response)
--- a/authentication.py
+++ b/authentication.py
@ -0,0 +1,13 @@
 import requests
 def get_access_token(api_key, api_secret):
    url = "https://chatglm.cn/chatglm/assistant-api/v1/get_token"
    data = {
        "api_key": api_key,
        "api_secret": api_secret
    }
    response = requests.post(url, json=data)
    token_info = response.json()
    print(token_info)
    return token_info['result']['access_token']
--- a/filter_pipeline.py
+++ b/filter_pipeline.py
@ -0,0 +1,17 @@
 import pandas as pd
 # 读取Excel文件
 df = pd.read_excel('pingcap_pipeline.xlsx')
 # 计算“当前详细状态及Close节奏”列的字数
 df['字数'] = df['当前详细状态及Close节奏'].apply(lambda x: len(str(x)))
 # 按字数排序并取TOP20
 top20_df = df.sort_values(by='字数', ascending=False).head(20)
 # 删除添加的字数列
 top20_df = top20_df.drop(columns=['字数'])
 # 输出到新的Excel文件
 top20_df.to_excel('output_top20.xlsx', index=False)
--- a/iShot_2024-08-12_07.47.51.png
+++ b/iShot_2024-08-12_07.47.51.png
--- a/output_top20.xlsx
+++ b/output_top20.xlsx
--- a/pingcap.csv
+++ b/pingcap.csv
--- a/pingcap.xlsx
+++ b/pingcap.xlsx
--- a/pingcap_full_data.xlsx
+++ b/pingcap_full_data.xlsx
--- a/pingcap_pipeline.xlsx
+++ b/pingcap_pipeline.xlsx
--- a/promptAgent.py
+++ b/promptAgent.py
@ -121,4 +121,4 @@ for index in range(0, 4):
 # 创建 ExcelHelper 实例并生成 Excel 文件
 excel_helper = ExcelHelper(excelData)
-excel_helper.create_excel('output.xlsx')
+excel_helper.create_excel('output.xlsx')