Applying AI to improve scalability
I was evaluating the implementation of a process that parses multiple files and uploads the resulting generated file as an object to an S3 bucket.
The main consideration was scalability, as the volume of incoming data was soon going to increase significantly and we did not want to have to scale up our service instances just to cope with this particular use case.
The existing process involved creating a temporary file and writing to it before uploading to S3. This was going to be problematic: the instance size involved was not going to have enough disk space to handle the larger data sets we knew would be coming in the next milestone.
I looked into bypassing the temporary file generation and uploading straight to S3. This would involve adjusting the setup to use the multipart upload API calls instead of a single write to S3.
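For context, the multipart flow generally looks like the sketch below. The bucket, key, and part size are placeholders, and the client is passed in so the chunking logic can be exercised with a stub instead of a live S3 connection; this is an illustration of the approach, not the code from the project.

```python
import io

# S3 requires parts of at least 5 MiB (except the last part)
PART_SIZE = 5 * 1024 * 1024

def upload_stream_multipart(client, bucket, key, stream, part_size=PART_SIZE):
    """Stream content to s3://bucket/key without a temporary file."""
    upload = client.create_multipart_upload(Bucket=bucket, Key=key)
    parts = []
    part_number = 1
    while True:
        chunk = stream.read(part_size)
        if not chunk:
            break
        resp = client.upload_part(
            Bucket=bucket, Key=key, UploadId=upload["UploadId"],
            PartNumber=part_number, Body=chunk,
        )
        parts.append({"PartNumber": part_number, "ETag": resp["ETag"]})
        part_number += 1
    # S3 only assembles the object once the upload is completed
    client.complete_multipart_upload(
        Bucket=bucket, Key=key, UploadId=upload["UploadId"],
        MultipartUpload={"Parts": parts},
    )
    return parts
```

With a real `boto3.client("s3")` these are the standard `create_multipart_upload` / `upload_part` / `complete_multipart_upload` calls; each part is buffered in memory only for as long as it takes to upload.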
I explained the situation to the AI command line interface and watched as it summarized its approach and refactored the implementation and tests to use buffering and multiple upload calls, without involving any temporary file.
On the surface everything looked good, but as I read through some comments the AI had included in the code, something started to look problematic...
It didn't do what I asked for
The comments mentioned that it was not using multipart upload, even though I had specified that in the instructions.
Instead of progressively uploading the content to the destination object in S3, this AI-generated implementation would write each buffer of data as the entire content of the destination object. This meant that only the last chunk of data would remain at the end of processing.
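The failure mode is easy to reproduce with a dict standing in for the bucket (all names here are hypothetical; the point is that a put-style write replaces the whole object rather than appending a part):

```python
import io

bucket = {}  # stands in for the S3 bucket: key -> object content

def flawed_upload(stream, key, buffer_size=10):
    """Mimics the generated code: each buffer is written as the
    entire object content, so every write replaces the last one."""
    while True:
        chunk = stream.read(buffer_size)
        if not chunk:
            break
        bucket[key] = chunk  # put-object semantics: replace, not append

flawed_upload(io.BytesIO(b"A" * 10 + b"B" * 10 + b"C" * 3), "out.txt")
# only the final chunk survives: bucket["out.txt"] == b"CCC"
```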
The importance of attention to detail
I had to push further to have tests generated that would surface the flaw in the implementation. In this instance that meant processing data that exceeded a single buffer size and validating that the content written to the destination was complete.
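A test along those lines is straightforward to sketch. The names below are hypothetical, and a correct append-style uploader is included so the assertion has something to pass against; the essential properties are that the input spans more than one buffer and that the destination is compared against the full input.

```python
import io

BUFFER_SIZE = 8

def upload(stream, store, key):
    """Accumulate every buffer into the destination object
    (the behaviour the multipart refactor was supposed to have)."""
    body = bytearray()
    while True:
        chunk = stream.read(BUFFER_SIZE)
        if not chunk:
            break
        body.extend(chunk)
    store[key] = bytes(body)

def test_content_spanning_multiple_buffers_is_complete():
    data = bytes(range(256)) * 3      # well over one BUFFER_SIZE
    assert len(data) > BUFFER_SIZE    # the test actually spans buffers
    store = {}
    upload(io.BytesIO(data), store, "result")
    assert store["result"] == data    # nothing lost between buffers

test_content_spanning_multiple_buffers_is_complete()
```

Run against the flawed overwrite-per-buffer version, this test fails immediately, which is exactly the signal that was missing from the AI-generated test suite.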
Still a need for the human in the loop
Without using very much imagination, I can envisage how teams could easily be seduced by the perception that AI can efficiently generate software with less need for human review. Based on speculative articles already circulating online, I think we may have already seen that from companies such as AWS.
Do we have to spell it out?
If AI is only capable of following strict instructions from someone who already knows enough of the details to recognize when it is done, then is it just the equivalent of outsourcing?
Will there be a cycle of "We can outsource now" and "We need to bring that back in house now"?