

Author: John Moscarillo

Creating a text feature for an existing web application using Large Language Models (LLMs): Weather Summary, News, and Location Guide Text Generator

I’ve been working with Large Language Models (LLMs) for the past year. I’ve built a couple of chatbots and, most recently, have added a text generation feature to my personal weather application.

Large Language Models (LLMs) are advanced AI models capable of understanding and generating human-like text. They have transformed the landscape of natural language processing (NLP) with applications ranging from chatbots and content creation to code generation and beyond. Models like GPT-4, developed by OpenAI, exemplify this.

I’ve ended up with four LLMs and, over time, have narrowed each one down to the tasks where it performs best. My initial intention was to use all four and cycle through them randomly, assuming they would all perform essentially the same. However, given the range of text content I want to display, I can’t use every LLM for all four features: Today, Forecast, News, and Guide.

Choosing the LLMs

I chose these four LLMs because they are the most popular general-purpose AI models a developer can get access to, and they are free or close to free to use. Using four also lowers cost by distributing the load and keeping each account below its billing threshold. I realized early on that the “latest and greatest” models will cost you quite a bit of money, and using them in a free application is not feasible, so I am limited to the lowest tiers each of the four providers offers. I have 20-30 users a day and currently pay about $5-10/month, and most of that cost comes from my own experimentation at various times rather than from user requests.

The Four Models and Their Uses

The four models I am using and the features for which they are being used are:

  1. gpt-3.5-turbo (OpenAI): Today, Forecast, Guide
  2. gemini-pro (Google): Today, Forecast, Guide
  3. llama-3-sonar-small-32k-online (Perplexity): News
  4. claude-3-haiku-20240307 (Anthropic): Forecast, Guide

To see these live: https://www.wxnow.app. The app is also available in the App Store and on Google Play, but it costs $1 there to cover the Apple tax. If you want to use it for free without browser chrome, open it in Safari and use “Add to Home Screen”.

In the app, under the address line, there are four to five text links. These are the features I have built using the LLMs: the first is the Today summary, the second is the daily Forecast, the third is regional News, and the fourth is the regional travel Guide. The fifth is the minute-by-minute forecast for the next-hour chart, which only appears if rain is forecast in the next hour; that forecast does not come from the LLMs.

Implementation

Signing up for API keys and having these models read prompts and return values took a couple of hours at most; after all, they are just APIs. The difference between LLMs and other APIs is that they no longer have static contracts. By this, I mean that as a developer, I can’t rely on the shape of the results LLMs return. If I give a weather API a latitude and longitude, I will always get the same fields with known value types in a format I can use programmatically. With LLMs, I pass a text blob and get back a text blob. So it’s really still an API, but interacting with it is an entirely new game. Because the contract is so open, the infrastructure is easy to set up, but getting the content right takes much longer. In fact, it very much follows the 80/20 rule (the Pareto principle), and at times I am not sure the last 20% is attainable with general LLMs. Let me explain some of the challenges I’ve encountered building these features.
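
To make that concrete, here is roughly what one of these calls looks like. This is a minimal sketch using OpenAI’s chat completions endpoint; the model name, temperature, and missing error handling are simplifications, and the other three providers follow the same request/response pattern.

// Minimal sketch of an LLM call: text blob in, text blob out.
// Assumes OPENAI_API_KEY is set; error handling omitted for brevity.
const askLLM = async (prompt: string): Promise<string> => {
  const res = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: 'gpt-3.5-turbo',
      temperature: 0.2, // lower values give more deterministic output
      messages: [{ role: 'user', content: prompt }],
    }),
  });
  const data = await res.json();
  // There is no fixed schema beyond this one string.
  return data.choices[0].message.content;
};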

Building the Guide Feature

I naively jumped in thinking: I have a location, what could I do with that using AI? So I started by simply asking ChatGPT, “Tell me something interesting about ‘location’”. It returned “fun” facts like:

  • Seattle's Famous Gum Wall
  • Seattle Buys More Sunglasses Per Capita
  • The Fremont Troll Sculpture

These are fun facts, but I started to think I would like more than just fun facts: what about hiking trails around the area? That led to many other questions, so I ended up building an array of text topics and cycling through them randomly to keep some control over the outcome and make the feature more of a travel guide than a fun-fact generator (a sketch of the picker follows the list). My current random list is:

  • Geography
  • History
  • Demographics
  • Culture
  • Government and Politics
  • Infrastructure
  • Food and Cuisine
  • Sports and Recreation
  • Environment
  • Community and Lifestyle
  • Indigenous People
  • Local Attractions
  • Shopping
  • Events and Festivals
  • Nightlife
  • Transportation
  • Accommodations
  • Travel Tips
  • Wildlife and Nature
  • Local Crafts and Artisans
  • Health and Wellness
  • Day Trips and Excursions

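The topics live in a simple array, and a helper picks one per request. The app’s actual helper isn’t shown here, but getRandomTopic amounts to something like this sketch:

// Illustrative sketch of the topic picker used by the Guide prompt.
// The real `topics` array holds the full list above.
const topics = ['Geography', 'History', 'Demographics' /* ...and the rest */];

const getRandomTopic = (): string =>
  topics[Math.floor(Math.random() * topics.length)];
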
With these, the final prompt for my travel guide becomes:

// displayAddress and coordinates come from the app's location state.
// Drop the street portion of the address; keep city, region, country.
const location = displayAddress
  ? displayAddress.split(',').slice(1).join(', ')
  : `longitude: ${coordinates?.longitude} 
    and latitude: ${coordinates?.latitude}`;

// textLength is an optional prop; default to a 40-word summary.
const getPrompt = (): string => {
  return `You are a travel guide assistant that
  provides information about geographic locations. 
  Provide a concise ${textLength ? textLength : 40} 
  word summary about ${getRandomTopic()} for 
  ${location}. Do not include the words "${location}" 
  in the response. Do not include the words "JSON", 
  "Format", or "concise ${textLength ? textLength : 40} 
  word summary" in the response.`;
};

It’s pretty cool and, for the most part, works quite well. It becomes clear how useful this could be for travel sites like Expedia. A challenge I am discovering is that AI will always give me an answer, right or wrong. This is a bit annoying; it would be easier if the model came back and said, sorry, I don’t have that answer, but instead we get, for example, a weather forecast that is completely made up. At this point I am still unsure whether this is due to the prompt or the temperature. I have tweaked both, but even at low temperatures I am given consistently wrong answers. An incorrect answer is easier to spot with weather, as you will see, but it becomes harder in the Guide and News features, where the answers look reasonable at first glance. Even those, however, are often obviously incorrect. If I search in the middle of Olympic National Park, Washington, USA (no city, just latitude/longitude coordinates) and ask about shopping areas, I’ll get answers like:

Shopping in Tokyo offers a diverse experience, from luxury boutiques in Ginza to quirky fashion in Harajuku. Explore bustling shopping districts like Shibuya and Akihabara for electronics and anime goods. Don't miss traditional markets such as Tsukiji and Ameya-Yokocho for authentic Japanese products and street food.

Building the News Feature

While building the guide feature, I discovered ChatGPT’s limitation regarding real-time data. For real-time news, I turned to Perplexity, which provides up-to-date information. So far, it does a good job of returning the latest news for large cities. I believe other models are incorporating real-time information, but they are still too costly for my needs.

The prompt I use for news is:

const { displayAddress, latitude, longitude } = props;
const location = displayAddress
  ? displayAddress.split(',').slice(1).join(', ')
  : `longitude: ${longitude} and latitude: ${latitude}`;

const getNewsPrompt = (): string => {
  return `Today is ${new Date().toString()}. You are a news assistant. Report 3 recent local news headlines around ${location} that have occurred in the past 24 hours. Order the results chronologically with the most recent news event first. Return only the headline, not the description. If there is no local news, respond with "No local news". If the language is not English, translate the content into English.
    
    EXAMPLE OUTPUT:
    ###TITLE ** (MONTH DAY, YEAR) ** SOURCE
    ###TITLE ** (MONTH DAY, YEAR) ** SOURCE
    ###TITLE ** (MONTH DAY, YEAR) ** SOURCE
    ...
    
    `;
};

After getting good results from the News output, I decided to try to build out an entire regional news page. My first attempt was to force JSON output so I could loop through it for display. This turned out to be extremely slow and inconsistent; in fact, with a hobby Vercel account and its default 10-second timeout, I would get more application timeout errors than results. I changed to parsing the text blob using hashtags and asterisks, and this sped things up exponentially. I am not sure why asking for JSON slowed things down so much, but it did. Even though it would give me JSON, that JSON was still part of a text blob, and extracting and formatting that block became a large task. Using hashtags and asterisks, I can easily parse the text blob and format the results, and all LLMs seem to play well with this concept (a parsing sketch follows the prompt below). The prompt for the regional news is as follows:

const { displayAddress, latitude, longitude } = props;
const location = displayAddress
  ? displayAddress.split(',').slice(1).join(', ')
  : `longitude: ${longitude} and latitude: ${latitude}`;

const getNewsPrompt = (): string => {
  return `Today is ${new Date().toString()}. You are a news assistant. Report the most recent local or regional news headlines around ${location}. Order the results chronologically with the most recent news event first. If there is no local news, respond with "No local news". If the language is not English, translate the content into English. Your output should prioritize being less than 350 words in total length. Use ### to separate articles. Use '**' between the TITLE and DESCRIPTION. Provide the SOURCE of the news headline at the end of each DESCRIPTION. 
    
    Example Output:
    ### TITLE ** MONTH DAY, YEAR ** DESCRIPTION ** Source: SOURCE
    ### TITLE ** MONTH DAY, YEAR ** DESCRIPTION ** Source: SOURCE
    ### TITLE ** MONTH DAY, YEAR ** DESCRIPTION ** Source: SOURCE
    ...
    
    `;
};

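For reference, parsing the delimited blob comes down to two string splits: one on ### for articles and one on ** for the fields within each. This is a simplified sketch of the idea, not the app’s exact code:

interface Article {
  title: string;
  date: string;
  description: string;
  source: string;
}

// Split on ### for articles, then on ** for the fields within each.
const parseNews = (blob: string): Article[] =>
  blob
    .split('###')
    .map((chunk) => chunk.trim())
    .filter(Boolean)
    .map((chunk) => {
      const [title = '', date = '', description = '', source = ''] = chunk
        .split('**')
        .map((part) => part.trim());
      return { title, date, description, source: source.replace(/^Source:\s*/, '') };
    });
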
To see the regional news, go to https://www.wxnow.app and click on "News" below the address line. Once News loads, you will see a "More News" link, which will take you to the regional news page.

Building the Today and Forecast Features

First, after seeing real-time data, I thought of asking Perplexity, "What is the ideal time to go on a walk today in location X?" Try this yourself to see the results. You will find the answers are generic, suggesting that morning is generally better than midday but not providing specific times or weather conditions. I realized that the term "ideal" is context-dependent and therefore open to interpretation, or in our case, in need of more definition. After all, what are "ideal" conditions? Some people might think 60°F is the perfect temperature while others might think 72°F is perfect. So this led me to try to define "ideal."

Second, no model has worldwide forecast data yet, so asking for weather information by coordinates is not possible. However, I have the weather data, and I thought maybe I could pass that information to the LLM and ask questions about it; essentially, summarize my JSON data. This is where things start to get interesting: we get some pretty cool answers, and also some pretty big misses.

Due to the cost of the models, the larger the question or the answer, the greater the cost. Words = money. So my prompt, with the weather data added, got huge. At first, I tried to push all 10 days of hourly data to the LLM, which was a bad idea: too many tokens. So I had to break the data into something useful, and since I only need roughly one day, I condensed the 10 days down to 2 days, or 48 hours. The query is still large, but I think still within reasonable cost. I don't yet know the actual cost, as it has only been a few days with the new 48-hour query, so it may become 24 hours.
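
The condensing step itself is simple. Assuming the raw 10-day hourly array is called hourlyForecast (an illustrative name, not necessarily the app's), it is essentially:

// Keep only the next 48 hours to stay within token (and cost) limits.
// `hourlyForecast` stands in for the full 10-day hourly array; the
// result is what the prompt code below receives as hourWeather.
const hourWeather = hourlyForecast.slice(0, 48);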

Third, "ideal" is too subjective to define directly. So, rethinking my approach, I had the LLMs generate a one-time JSON object of weather conditions, their ranges, and what those ranges feel like to people. You can see part of that list below; I called it weatherConditions. This way, I can ask the LLM to summarize the weather data and use the JSON object to determine whether the weather is ideal or not. I am not sure this is the best approach, but it seems to work well.

I also provided the LLMs with definitions for the JSON weather data keys, thinking they could use these to describe the given weather data. This also seems to help produce a better, more consistent answer, but I am not sure it is necessary.

Given the new prompt, the LLMs are able to generate a weather summary that is more accurate and contextually relevant. However, I am not sure I am getting the best results, and I still see weird anomalies, like being told to expect extremely heavy rain when there is no rain in the provided forecast data.

Below are the prompt and supporting definitions for the weather summary. The only difference for the Forecast feature is that it uses daily data instead of hourly data; the prompt is essentially the same. Surprisingly, the Forecast feature is more accurate than the Today feature. I am not sure why, but I think it is because the LLMs are better at summarizing smaller JSON prompts: daily data is only 9 objects, while hourly is 48.

Because the text shows on a mobile device, and given the area I have to work with in my application, I have tried to keep the text to a specific word limit. This is next to impossible with LLMs, as they are not consistent in the number of words they return. So I added a textLength variable to the prompt to try to keep the text to a specific length. It is not working as well as I would like, but it is better than nothing and keeps the text to a reasonable length most of the time. If I try to control this with tokens, I end up with text that is cut off: the output just stops when it hits the token limit. So I have to use words and hope for the best. I could, of course, put the text in a container with a scroll bar, but I don't like the design of scrolling text in the middle of a mobile page, so that's not an option from a design perspective.
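
One possible mitigation, sketched here as an idea rather than something the app currently does, is a post-processing guard that trims an over-long response back to its last complete sentence instead of hard-cutting on tokens:

// Possible guard, not necessarily the app's approach: if the response
// runs past the word budget, cut at the last sentence boundary rather
// than mid-sentence the way a token limit would.
const trimToSentence = (text: string, maxWords: number): string => {
  const words = text.split(/\s+/);
  if (words.length <= maxWords) return text;
  const clipped = words.slice(0, maxWords).join(' ');
  const lastStop = Math.max(
    clipped.lastIndexOf('.'),
    clipped.lastIndexOf('!'),
    clipped.lastIndexOf('?'),
  );
  return lastStop > 0 ? clipped.slice(0, lastStop + 1) : clipped;
};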

Today feature prompt and definitions are as follows:


const weatherKeys = {
  MoonPhases: {
    type: 'string',
    description: 'Phases of the moon.',
  },
  cloudCover: {
    type: 'percentage',
    description:
      'The percentage of the sky covered with clouds during the period, values in percent.',
  },
  conditionCode: {
    type: 'string',
    description:
      'weather condition at the time, ex "rain", "snow", "clear", "cloudy", "partly cloudy"',
  },
  daylight: {
    type: 'boolean',
    description: 'Indicates whether the hour starts during the day or night.',
  },
  forecastStart: {
    type: 'string',
    description: 'The starting date and time of the forecast.',
  },
  humidity: {
    type: 'percentage',
    description: 'The relative humidity at the start of the hour',
  },
  precipitationIntensity: {
    type: 'number',
    description: 'The amount of precipitation forecasted to occur during the period, in mm/hr.',
  },
  precipitationChance: {
    type: 'percentage',
    description: 'The chance of precipitation forecasted to occur during the hour',
  },
  precipitationType: {
    type: 'string',
    description:
      'The type of precipitation forecasted to occur during the period. ex "rain", "snow"',
  },
  pressure: {
    type: 'string',
    description: 'The direction of change of the sea-level air pressure.',
  },
  snowfallIntensity: {
    type: 'number',
    description: 'The rate at which snow crystals are falling, in millimeters per hour.',
  },
  temperature: {
    type: 'number',
    description: 'The temperature at the start of the hour.',
  },
  temperatureApparent: {
    type: 'number',
    description:
      'The feels-like temperature when considering wind and humidity, at the start of the hour.',
  },
  uvIndex: {
    type: 'number',
    description: 'The level of ultraviolet radiation at the start of the hour.',
  },
  visibility: {
    type: 'number',
    description: 'The distance at which terrain is visible at the start of the hour, in meters.',
  },
  windSpeed: {
    type: 'number',
    description: 'The wind speed at the start of the hour.',
  },
  windGust: {
    type: 'number',
    description: 'The maximum wind gust speed during the hour.',
  },
};

const weatherConditions = {
  Precipitation: {
    importance: 1,
    precipitationIntensity: [
      {
        amount: '0.05 - 2.5 mm/hr',
        description: 'Drizzle or very light rain.',
        impact:
          'Generally, this is not disruptive. People can be outside with minimal discomfort, though an umbrella or light raincoat is advisable.',
        outsideActivities: 'fair',
      },
      {
        amount: '2.6 - 7.6 mm/hr',
        description: 'Moderate Rain. Steady rain that can make surfaces wet and slightly slippery.',
        impact:
          'Walking outside is manageable with proper rain gear. Roads and sidewalks may become slick, and visibility may decrease slightly.',
        outsideActivities: 'poor',
      },
      ...
    ],
    precipitationChance: [
      {
        amount: '0% - 19%',
        description: 'Low chance of precipitation.',
        impact: 'Outdoor activities are unlikely to be affected by rain.',
        outsideActivities: 'good',
      },
      {
        amount: '20% - 50%',
        description: 'Moderate chance of precipitation.',
        impact: 'Rain may occur, but it is not guaranteed. Be prepared for possible showers.',
        outsideActivities: 'bad',
      },
      ...
    ],
  },
  Temperature: {
    importance: 2,
    range: [
      {
        amount: 'Below 0°F (-18°C)',
        description: 'Extremely cold',
        impact:
          'This can be life-threatening if you are not dressed properly. Frostbite and hypothermia can occur quickly.',
        outsideActivities: 'very bad',
      },
      {
        amount: '0°F to 32°F (-18°C to 0°C)',
        description: 'Very cold',
        impact:
          'Necessary to wear multiple layers, including a heavy coat, gloves, and a hat. Frostbite and hypothermia are still risks. Avoid being outside for extended periods.',
        outsideActivities: 'bad',
      },
      ...
    ],
  },
  ...
};

const hourWeather = [
  {
    "conditionCode": "MostlyClear",
    "daylight": false,
    "cloudCover": "24%",
    "forecastStart": "Thu Jun 20 2024 00:00:00 GMT-0700 (Pacific Daylight Time)",
    "humidity": "77%",
    "precipitationChance": "0%",
    "precipitationIntensity": "0mm/hr",
    "precipitationType": "clear",
    "pressure": "rising",
    "temperature": "56°F",
    "temperatureApparent": "55°F",
    "uvIndex": 0,
    "windSpeed": "2 mph",
    "visibility": "23166 meters",
    "windGust": "5 mph"
  },
  {
    "conditionCode": "MostlyClear",
    "daylight": false,
    "cloudCover": "26%",
    "forecastStart": "Thu Jun 20 2024 01:00:00 GMT-0700 (Pacific Daylight Time)",
    "humidity": "79%",
    "precipitationChance": "0%",
    "precipitationIntensity": "0mm/hr",
    "precipitationType": "clear",
    "pressure": "steady",
    "temperature": "55°F",
    "temperatureApparent": "54°F",
    "uvIndex": 0,
    "windSpeed": "2 mph",
    "visibility": "22830 meters",
    "windGust": "5 mph"
  },
  ...
];

const getPromptTodayWeather = (): string => {
    return `DATA: Weather data in a stringified array of hourly
    key-value pair, JSON objects: ${JSON.stringify(hourWeather)}.
    The Weather data keys are defined in this stringified JSON
    object ${JSON.stringify(weatherKeys)}.

    SUMMARY: As a weather assistant you will analyze the given
    Weather data and provide a weather forecast incorporating
    known weather averages for ${location}. Use the Weather
    condition ranges which are listed by importance in a
    stringified array of JSON objects between the tags <keys>
    </keys> to describe the weather forecast. Your output will
    be a concise ${textLength ? textLength : 40} word weather
    forecast for each day in the data.
    <keys>${JSON.stringify(weatherConditions)}</keys>

    RULES:
      - Report bad weather conditions that significantly impact
       outdoor activities. Include the time ranges when these
       conditions occur.
      - Report significant deviations from averages. Use the word
       "changes" instead of "deviations" to indicate deviations.
      - Do not include terms like "50 word summary," "JSON,"
       "Format," or "JSON Format" in your response.
      - Use standard time format (e.g., 4:00 PM).
      - Do not use past tense tone of voice for forecast.
      - If using ranges in your response also provide the time
       of those ranges.
      - Use "Wednesday" instead of "**Wednesday, June 19, 2024:**".
    `;
  };

Challenges

Working with LLMs to generate weather summaries, news, and location guides has been both challenging and rewarding. Some of the challenges I have encountered include:

  • LLM capabilities: Understanding the strengths and limitations of each model.
  • Consistent output: Reliably generating accurate and contextually relevant content.
  • Order of responses: The order of information within responses is not always the same. Sometimes I get temperature ranges from lowest to highest, sometimes highest to lowest. This is not a huge issue, but it would be nice to have a consistent order for the same prompt, presented the way humans expect to see values. This was very difficult to get right, and I eventually stopped trying: adding the right rules helps, but too many rules contradict each other and the LLMs get confused.
  • Text length: Keeping the text to a specific word limit is next to impossible, as the LLMs are not consistent in the number of words they return. The textLength variable in the prompt helps and keeps the text reasonable most of the time, but it is far from exact. Tokens are not a good way to control this, as the text just stops when it hits the token limit. So I have to use words and hope for the best.
  • Real-time information: Managing the limitations of real-time data availability. Most LLMs will return a response even if they don't have the information, which can lead to inaccurate or irrelevant content; when searching for news, for example, I see a suspicious number of reported small earthquakes. The terminology for this is that LLMs hallucinate. Good for idea creation, not so much for factual content.
  • Speed: Ensuring the feature is responsive and generates content quickly. LLMs are really slow. They do offer streaming, which lets the response start rendering early to keep users engaged; it's a smart UI technique that makes current LLM UIs usable. My features haven't needed streaming so far, as each is a one-time response that is fast enough. However, as I try to do more with regional news, I am realizing streaming may become necessary to keep the user engaged.
  • Cost: Keeping costs low is not easy. My experience implementing these features is that getting more accurate answers requires larger inputs. Essentially, better answers cost more.
  • Accuracy: It is hard to ensure generated content is accurate when, for example, creating regional-level news on a global scale. I would want to do a lot of testing before shipping a global feature to thousands of users.

Conclusion

The LLMs in their current form are slow, costly, and ultimately better suited to conversational applications. It is difficult to build a useful feature that requires a lot of information and is not conversational. To be fair, I am using the LLMs in a way that is probably not the intended use case and is better suited to customized models. I am still learning, and I was hoping the LLMs could be general sources of information on a global scale. I am finding that is not the case. Perhaps in the future, LLM logic will be incorporated into specific APIs, such as weather or news, where a developer could ask more targeted questions. A weather model incorporated into a traditional API might be able to handle open-ended questions such as "When would be the best weekend to go windsurfing in June 2025?" or "What weekend in September 2027 on Cape Cod would be best for a wedding? There should be no rain, little to no clouds, and temperatures in the 70s." This could then return JSON in a structured response instead of a text string, making the results more useful for development.
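
To make that concrete, such a request/response contract might look like the following sketch. Every field name here is imagined; no such API exists today.

// Hypothetical request/response for an LLM-backed weather API.
// None of these fields exist in any real API today.
const request = {
  question: 'What weekend in September 2027 on Cape Cod would be best for a wedding?',
  constraints: { precipitation: 'none', cloudCover: 'low', temperatureF: [70, 79] },
};

const response = {
  answer: { start: '2027-09-18', end: '2027-09-19' },
  confidence: 0.72,
  note: 'Based on historical averages; actual forecasts this far out are not possible.',
};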

For news, it would be cool to pull regional news from around the world, with real-time translation. Global newspapers are in decline, so it would be interesting to compile regional news into a model using sources like Nextdoor, city crime maps, and city and county information, and allow querying on that dataset.

LLMs are cool and very powerful. In certain use cases, they can be an extremely powerful tool for increasing productivity: think search and replace, generating new ideas, or even generating simple lists of content, such as my weather condition list. However, they are not a silver bullet. Trying to use them to generate content on the fly for a text summary feature has proven difficult. They are just not reliable enough yet for me to feel comfortable that they will consistently deliver what I expect, and, depending on the task, they can require a lot of work to get useful results for features that might really just be "meh" to the user.

Developer: "Check it out, LLMs are making that weather summary."

User: "Cool, I don't need to read a text weather summary."

I think LLMs will become more useful as they gain more information and, with it, more intelligence. For now, though, they are still young.