Dustin Dannenhauer's avatar
Dustin Dannenhauer
dustind@dtdannen.github.io
npub1mgvw...pdjc
Founder @ Delegance AI
Dustin Dannenhauer's avatar
Dustin 2 years ago
If a client chooses to make a new feature via DVM, then other clients could adopt the feature quickly. Consider recommending new people to follow on Nostr. Each client could implement their own version, or instead one person could make a DVM that other clients hit to get the data. DVMs may cost money but paying per each request to the DVM probably doesn’t always make sense. “Salaried” DVMs might be a solution. Client subscription fees could cover DVM fees. Free clients wouldn’t have the paid DVM features. Network effects would be insane here!
Dustin Dannenhauer's avatar
Dustin 2 years ago
It’s interesting how actions are chosen across different LLM agents. Judging from the description on OpenAI's website, the decision of which function (aka tool) to use is decided by the model (i.e. GPT4) based on the user’s incoming message and chat (thread) history. “In this example, we define a single function get_current_weather. The model calls the function multiple times, and after sending the function response back to the model, we let it decide the next step. It responded with a user-facing message which was telling the user the temperature in San Francisco, Tokyo, and Paris. Depending on the query, it may choose to call a function again. If you want to force the model to call a specific function you can do so by setting tool_choice with a specific function name. You can also force the model to generate a user-facing message by setting tool_choice: "none". Note that the default behavior (tool_choice: "auto") is for the model to decide on its own whether to call a function and if so which function to call.” See
Dustin Dannenhauer's avatar
Dustin 2 years ago
Happy new year #Nostr ! 2023 was incredible, can’t wait for 2024!
Dustin Dannenhauer's avatar
Dustin 2 years ago
Has anyone made a bot that pings you when someone has posted to a community you are moderating, so you can immediately review it? I started a community about ai-papers and missed a post from someone a few weeks ago. I didn’t get a notification in Nostur or Damus. If this doesn’t exist yet, I may try my hand at setting it up.
Dustin Dannenhauer's avatar
Dustin 2 years ago
There's no thrill quite like breaking production code
Dustin Dannenhauer's avatar
Dustin 2 years ago
I cherish a quiet day of coding
Dustin Dannenhauer's avatar
Dustin 2 years ago
Northern Virginia Nostr meetup was a hit! Looking forward to many more events, thanks to all who came! #NostrNovember
Dustin Dannenhauer's avatar
Dustin 2 years ago
Here's my review of the ChatDev paper, let me know what you think! Qian, C., Cong, X., Yang, C., Chen, W., Su, Y., Xu, J., ... & Sun, M. (2023). Communicative agents for software development. arXiv preprint arXiv:2307.07924. This paper presents a new approach to software development where many calls to LLMs in different roles (CEO, CTO, programmer, reviewer, etc) build an entire software project. The novelty of the paper seems to be the specific roles of the LLMs and the flow of calls between LLMs to design, write, review, and test code. I also liked that there were artistic agents that made assets to be used in the software (like player icons and button icons). My biggest issue with the paper is that it doesn’t formally define “thought instruction” or provide clear enough examples of it and the software projects it generated were small at only a few hundred lines of code. Experimentally, I’m not sure how well it generalizes because it is not clear if the evaluation dataset was used in the training of GPT 3.5.

Questions I had about this paper:

1. Are we sure the dataset for instruction-following (Camel[23]) is not in the training set of the LLM? If so, perhaps these results won’t generalize well to new software projects.
 Comments:
 1. Gpt 3.5 was used instead of 4, so maybe the results will be better when using Gpt 4 2. One of the key contributions seems to be the “thought instruction” mechanism, but there is no clear example of exactly what that is. On pages 6 and 7 it says “thought instruction includes swapping roles to inquire about which methods are not yet implemented and then switching back to the provide the programmer with more precise instructions to follow”. Is that all “thought instruction” is? I recommend a more formal or complete description of it in the paper. 3. Lines of source code for the projects was pretty small, with the max being 359 lines of code generated in one project. 4. The paper makes a big deal about the price of their approach being tiny ($0.2967) but given that the projects are so tiny, this is not necessarily cheaper than human developers for more realistic software systems if the cost does not scale linearly with the number of lines of code. 5. “Fortunately, with the thought instruction mechanism proposed in this paper, such bugs can often be easily resolved by importing the required class or method.” - this sentence is vague and doesn’t fully explain how “thought instruction” solves these types of bugs. 6. In the discussion section it says this approach to software development is “training-free” but the LLM has to be trained, so that’s not exactly fair to say. 7. I appreciated the examples in the appendix, however I wish the paper was clearer about exactly what “thought instruction” is, such as providing an example with and without thought instruction.