• 0 Posts
  • 1.46K Comments
Joined 2 years ago
Cake day: June 16, 2023






  • The issue here is that we’re well into sharply exponential expenditure of resources for diminishing gains, there’s a lot of good theory predicting that the breakthroughs we’ve seen are about tapped out, and there’s no good way to anticipate when a further breakthrough might happen; it could be real soon or another few decades off.

    I anticipate a pullback of the resources invested and a settling into some middle ground where the current state of the art is absolutely useful/good enough: mostly wrong, but very quick when it’s right, with relatively acceptable consequences for the mistakes. Perhaps society will get used to the sorts of things it fails at and cut back on how much we try to make LLMs play in that 70%-wrong sort of use case.

    I see LLMs replacing first-line support, maybe escalating to a human when actual stakes arise for a call (issuing a warranty replacement, a usage scenario with serious consequences, a customer demanding human escalation after recognizing they are falling through the AI cracks without the AI figuring out to escalate). I expect to rarely ever see “stock photography” used again. I expect animation to employ AI at least for backgrounds, like a generic forest that no one is going to actively look at but that must be plausibly a forest. I expect it to augment software developers, but not to enable a generic manager to code up whatever he might imagine. The commonality in all of these is that they live in the mind-numbing sorts of things current LLMs can get right, and/or have a high tolerance for mistakes with ample opportunity for humans to intervene before the mistakes inflict much cost.



  • I’ve found that as an ambient code completion facility it’s… interesting, but I don’t know if it’s useful or not…

    So on average it’s totally wrong about 80% of the time; 19% of the time the first line or two is useful (either correct or close enough to fix); and 1% of the time it actually fills in a substantial portion in a roughly acceptable way.

    It’s exceedingly frustrating and annoying, but I’m not sure I can call it a net loss in time.

    So reviewing the proposal for relevance, and the cutting and editing, adds time to my workflow. Let’s say that on average, for a given suggestion, I spend 5% more time deciding whether to trash it, use it, or amend it, versus not having a suggestion to evaluate in the first place. If the roughly 20% of useful suggestions make those scenarios 500% faster, then I come out ahead overall, though I’m annoyed 80% of the time (a rough version of that math is sketched below). My guess as to whether a suggestion is even worth looking at improves with context: if I’m filling in a pretty boilerplate thing (e.g. taking some variables and starting to write out argument parsing), it has a high chance of a substantial match. If I’m doing something even vaguely esoteric, I just ignore the suggestions popping up.
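
    A minimal back-of-the-envelope sketch of that break-even math, assuming the made-up numbers from this comment (5% evaluation overhead on every suggestion, ~20% of suggestions useful, useful ones ~5x faster); none of this is measured:

        # Hypothetical break-even model for ambient code completion.
        # Every number here is an assumption from the comment above.
        baseline = 1.0       # time to write a snippet unassisted
        overhead = 0.05      # extra time spent judging every suggestion
        useful_rate = 0.20   # fraction of suggestions worth using
        speedup = 5.0        # useful suggestions are ~500% faster

        # Every snippet pays the evaluation overhead; useful ones get
        # written at baseline / speedup, the rest at the plain baseline.
        with_ai = (
            overhead * baseline
            + useful_rate * baseline / speedup
            + (1 - useful_rate) * baseline
        )

        print(f"relative time with completions: {with_ai:.2f}")  # 0.89

    On those assumptions it nets out roughly 11% faster overall, but the margin is thin: a slightly higher overhead or a lower hit rate flips it into a loss, which matches the ambivalence above.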

    However, that 20% is still a problem, since I’m maybe too lazy and complacent: spending the 100 milliseconds glancing at a word that looks right in review will sometimes fail me, compared to the 2-3 seconds spent typing that same word out by hand.

    That 20% success rate, where I fix up what’s useful and dispose of the rest, works for code completion, but prompt-driven tasks seem so much worse for me that it’s hard to imagine them being better than the trouble they bring.



  • As someone who tries to keep the vague numbers in mind, it would be strange to me as well, but I suspect a large number of people don’t really try to keep even vague numbers in mind about how many people are around, or how many could realistically reside in a place like NYC.

    They track the rough oversimplifications, like “barely anyone lives in the middle of the country”. Every TV show they see set in the US either has a bunch of background people in NYC or LA, or is in the middle of nowhere in a town seemingly made up of mere dozens of people. They might know that “millions” live in the US and also that “millions” live in NYC, so it’s the same “ballpark” if they aren’t keeping track of the specifics. They’d probably believe 10 million in NYC and 50 million nationwide.

    This is presuming they bother to follow through on the specific math rather than merely throwing out a rough percentage; a quick check of what that guess actually implies is sketched below.
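
    As a rough sanity check on that kind of guess (the “real” figures below are approximate, and the guessed ones come from the paragraph above):

        # Hypothetical sanity check: what share of the US lives in NYC?
        guess_nyc, guess_us = 10e6, 50e6    # the imagined "ballpark"
        real_nyc, real_us = 8.5e6, 335e6    # rough recent estimates

        print(f"guessed share: {guess_nyc / guess_us:.0%}")  # 20%
        print(f"actual share:  {real_nyc / real_us:.1%}")    # ~2.5%

    The imagined ballpark puts a fifth of the country in NYC; the real share is closer to one in forty.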




  • I’m extremely skeptical of any organized religion, where divine authority is asserted for the words said/written by some dudes, but I’m not going to close the door on something beyond what we can know.

    But no one’s guess carries any more weight than another’s, and no person should be assumed to have an inherently more valid relationship with divinity than anyone else.

    So I have a bit of vague faith, but not in any of the concrete concepts put forth by religion, since I have no reason to think their guesses would be any better than one I could make on my own; the conviction that they are seems like a very dangerous thing.

    It’s not anything actionable, just more a hope that there’s more to things than we see.






  • It’s not like the road test is particularly rigorous. It’s worthwhile to administer, but you’d have to be in super bad shape for the examiner to even notice you doing anything off, so it’s not like the risk is high.

    Though the written test… I took a practice one, and my driving experience had not kept me in shape to pass it. Of course the questions are stupid, like “which of the following violations carries the harshest penalty” or “exactly how many feet from an intersection must you park when street parking on an unmarked street”.



