Awesome Large Language Model Questions and Answers by the FinGPT Community
2023-10-25: Questions presented to a Senior Scientist at OpenAI during a conference
Q1: Large Model Hyperparameter Tuning Strategy: How can you efficiently search and adjust hyperparameters for large models given their extensive hyperparameter space? Are there best practices or things to be cautious of?
Answer: There’s no special method for hyperparameter tuning. It mostly relies on personal experience.
Q2: Interpreting Large Models: Large models are often seen as “black boxes.” Does OpenAI have effective methods or tools to enhance the interpretability of these large models?
Answer: Currently, there's no method to increase the interpretability of large models. It's something of a false proposition.
Q3: Hallucination in Large Models: When developing large models, how can we identify and guard against “illusions/hallucinations” — i.e., outputs that seem logical but are misleading or false? Does OpenAI have systematic methods or standard procedures to assess and mitigate these risks?
Answer: To reduce hallucination in large models, improvements can be made at the data layer, and memorization training can be applied during the fine-tuning phase.
Q4: Customized vs. Universal Large Models: Will specialized large models become obsolete if universal models, like future GPT-5, GPT-6, or GPT-7, outperform in specific domains (e.g., finance)?
Answer: OpenAI has no plans for specialized models, but specialized models are promising, especially in industries with private or unique data like finance.
Q5: Large Model Inference Quantization: How can inference efficiency be improved without compromising output quality?
Answer: For model weight quantization specifics, refer to the GPTQ paper.
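For intuition, here is a minimal sketch of the naive per-channel round-to-nearest baseline that GPTQ improves on (GPTQ quantizes weights column by column and uses second-order information to compensate the remaining weights for rounding error). The function names are illustrative, not from the GPTQ codebase:

```python
import numpy as np

def quantize_rtn(w: np.ndarray, bits: int = 4):
    """Naive per-output-channel round-to-nearest quantization.
    GPTQ improves on this baseline with Hessian-based error correction."""
    qmax = 2 ** (bits - 1) - 1                            # 7 for int4
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax   # one scale per row
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4, 8).astype(np.float32)
q, s = quantize_rtn(w)
print("max abs error:", np.abs(w - dequantize(q, s)).max())
```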
Q6: Model Quantization: Why does int4 quantization significantly reduce model size yet deliver lower inference efficiency than int8? Are there any solutions?
Answer: There are two aspects to model quantization: using low-precision hardware for computation, and reducing the memory bandwidth required for model throughput. If int4 performance on a chip is not as good as int8, consider storing the model in int4 and converting to int8 for computation.
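A minimal NumPy sketch of that storage trick: weights are kept packed two int4 values per byte (halving memory traffic), then unpacked to int8 right before an int8 compute kernel would run. The packing layout is an assumed illustration, not any particular framework's format:

```python
import numpy as np

def pack_int4(q: np.ndarray) -> np.ndarray:
    """Pack pairs of int4 values (range [-8, 7]) into single bytes."""
    assert q.shape[-1] % 2 == 0
    u = (q + 8).astype(np.uint8)                 # shift to unsigned [0, 15]
    return (u[..., 0::2] << 4) | u[..., 1::2]

def unpack_to_int8(packed: np.ndarray) -> np.ndarray:
    """Unpack back to int8 so int8 matmul kernels can be used."""
    hi = (packed >> 4).astype(np.int8) - 8
    lo = (packed & 0x0F).astype(np.int8) - 8
    out = np.empty(packed.shape[:-1] + (packed.shape[-1] * 2,), np.int8)
    out[..., 0::2], out[..., 1::2] = hi, lo
    return out

q = np.random.randint(-8, 8, size=(2, 8), dtype=np.int8)
packed = pack_int4(q)                            # half the memory footprint
assert (unpack_to_int8(packed) == q).all()
```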
Q7: Open-Sourcing Inference Acceleration Frameworks: Are there future plans to open source the inference acceleration framework related to ChatGPT?
Answer: It’s believed that OpenAI will not open-source the model inference acceleration framework.
Q8: Controlling Large Model Input Parameters: How to effectively control the input configurations of large models and evaluate the generated results?
Answer: It relies on personal experience.
Q9: Safety of Large Model Responses: How can the safety of responses from large models be ensured? And how can online inference latency be reduced?
Answer: There's no specific method to improve the safety of large model responses; the practical approach is to collect targeted data and feed it to the model.
Q10: Future of Large Model Training: Will there be new distributed training methods beyond tensor parallelism and 3D parallelism?
Answer: The future direction of distributed training seems to be automation, replacing manual strategies with automated software that allocates resources based on cluster configurations.
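To ground the terminology: tensor parallelism is one of the manual strategies such automated systems would pick for you. Here is a toy NumPy simulation of a column-parallel linear layer, with the two "devices" simulated on one machine:

```python
import numpy as np

# Column-parallel matmul: each "device" holds half of the weight
# matrix's columns and computes its shard independently; the results
# are then concatenated, which is what a real framework does across GPUs.
x = np.random.randn(4, 16)
w = np.random.randn(16, 32)
w0, w1 = np.split(w, 2, axis=1)           # shard the columns across 2 devices
y = np.concatenate([x @ w0, x @ w1], axis=1)
assert np.allclose(y, x @ w)              # matches the unsharded computation
```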
Q11: Handling Failures during Training: How to address single point failures during training and quickly take over tasks from failed nodes?
Answer: Restart and find a machine without faults.
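That terse answer presumes periodic checkpointing, so a restarted job loses only the work since the last save. The speaker didn't elaborate, but a minimal PyTorch-style sketch of the standard pattern, with an illustrative checkpoint path and interval, might look like this:

```python
import os
import torch

CKPT = "checkpoint.pt"  # illustrative path; real jobs write to shared storage

def save_checkpoint(step, model, optimizer):
    torch.save({"step": step,
                "model": model.state_dict(),
                "optim": optimizer.state_dict()}, CKPT)

def load_checkpoint(model, optimizer):
    if not os.path.exists(CKPT):
        return 0                                  # fresh start
    state = torch.load(CKPT)
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optim"])
    return state["step"] + 1                      # resume where we left off

model = torch.nn.Linear(16, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
start = load_checkpoint(model, optimizer)         # survives a node restart

for step in range(start, 1000):
    loss = model(torch.randn(8, 16)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % 100 == 0:
        save_checkpoint(step, model, optimizer)   # bounds lost work to 100 steps
```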
Q12: Vector Databases: Is there a real demand for vector databases? Some projects, like AutoGPT, seem to be moving away from them.
Answer: Vector databases might be needed in specialized models or enterprise-scale models.
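For context, the core operation a vector database serves is nearest-neighbor search over embeddings. A brute-force cosine-similarity sketch in NumPy is below; production systems replace this full scan with approximate indexes (e.g., FAISS or HNSW), and the embeddings here are random stand-ins:

```python
import numpy as np

def top_k(query: np.ndarray, index: np.ndarray, k: int = 3) -> np.ndarray:
    """Brute-force cosine-similarity search over an embedding matrix."""
    index_n = index / np.linalg.norm(index, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    scores = index_n @ query_n
    return np.argsort(-scores)[:k]          # ids of the k nearest documents

docs = np.random.randn(1000, 384).astype(np.float32)   # stand-in embeddings
query = np.random.randn(384).astype(np.float32)
print(top_k(query, docs))
```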
Q13: Crowdsourcing Feedback for Open-source Large Language Models: Would OpenAI be willing to create a platform for this?
Answer: No plans at the moment.
Q14: Using RAG vs. Fine-tuning in LLMs: When should you use Retrieval-Augmented Generation, and when should you fine-tune?
Answer: There’s no fixed strategy. They aren’t mutually exclusive. For instance, fine-tuning specialized models can enhance results, but it doesn’t mean RAG isn’t needed.
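A minimal sketch of how the two compose: retrieve the most relevant documents, then prepend them to the prompt of a model that may itself be fine-tuned. The embed() and generate() helpers are hypothetical placeholders, not any specific API:

```python
import numpy as np

def retrieve(question, corpus, embed, k=3):
    """Rank corpus documents by embedding similarity to the question.
    embed() is a hypothetical text-to-vector function."""
    q = embed(question)
    scores = [float(np.dot(embed(doc), q)) for doc in corpus]
    return [corpus[i] for i in np.argsort(scores)[::-1][:k]]

def rag_answer(question, corpus, embed, generate):
    """generate() is a hypothetical call to an LLM, fine-tuned or not."""
    context = "\n".join(retrieve(question, corpus, embed))
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return generate(prompt)
```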
Q15: Evaluation Methods at OpenAI: What are the evaluation methods used at OpenAI for their models?
Answer: There are general methods as well as proprietary ones that are not disclosed, but there are also open-source LLM leaderboards.
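As a concrete example of the kind of metric such leaderboards aggregate, here is a minimal exact-match harness; generate() is a hypothetical stand-in for a model call:

```python
def exact_match_accuracy(examples, generate):
    """Fraction of (prompt, reference) pairs the model answers verbatim.
    Real benchmarks add normalization, few-shot prompts, and more metrics."""
    hits = sum(generate(prompt).strip() == ref.strip()
               for prompt, ref in examples)
    return hits / len(examples)
```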
Q16: Future of Autonomous Systems like AutoGPT: What’s the trend?
Answer: Autonomous agents have a promising future, but it might take some time.
Q17: Large Models Emulating Celebrities: How to train a model to emulate a celebrity’s experiences and statements?
Answer: Digital avatars of celebrities can be built. Separate pre-training isn't required; the model can be taught during the fine-tuning phase using the celebrity's data.
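A minimal sketch of assembling such fine-tuning data in the widely used chat-message JSONL format; the record and file name are illustrative, and a real dataset would be built from the person's interviews, writing, and speeches:

```python
import json

examples = [  # illustrative pairs drawn from the celebrity's public statements
    {"question": "What drives you?",
     "answer": "I've always said that curiosity comes first."},
]

with open("finetune_data.jsonl", "w") as f:
    for ex in examples:
        record = {"messages": [
            {"role": "system", "content": "Respond in the celebrity's voice."},
            {"role": "user", "content": ex["question"]},
            {"role": "assistant", "content": ex["answer"]},
        ]}
        f.write(json.dumps(record) + "\n")
```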
Q18: 2024 Predictions: If everyone is training large models in 2023, does it mean that inference will be the focus in 2024?
Answer: Every year, different people or companies have different focuses. In 2023, both training and inference are being worked on.
Q19: Breakthroughs in 2024 for Large Models: What might they be?
Answer: Multimodal approaches, especially during the pre-training phase.
Q20: Opinion on AMD's MI300 Series Graphics Cards: How do they perform?
Answer: The new AMD MI300 card is believed to achieve 60–70% of the performance of an equivalent NVIDIA card. For large model developers, using NVIDIA's GPUs means they can run models directly, but using AMD's GPUs might require months of adaptation.