• 6 Posts
  • 44 Comments
Joined 2 years ago
Cake day: July 19th, 2023

  • I think that you have useful food for thought. I think that you underestimate the degree to which capitalism recuperates technological advances, though. For example, it’s common for singers supported by the music industry to have pitch correction which covers up slight mistakes or persistent tone-deafness, even when performing live in concert. This technology could also be used to allow amateurs to sing well, but it isn’t priced for them; what is priced for amateurs is the gimmicky (and beloved) whammy pedal that allows guitarists to create squeaky dubstep squeals. The same underlying technology is configured for different parts of capitalism.

    From that angle, it’s worth understanding that today’s generative tooling will also be configured for capitalism. Indeed, that’s basically what RLHF does to a language model; in the jargon, it creates an “agent”, a synthetic laborer, based on desired sales/marketing/support interactions. We also have uses for raw generation; in particular, we predict the weather by generating many possible futures and performing statistical analysis. Style transfer will always be useful because it allows capitalists to capture more of a person and exploit them more fully, but it won’t ever be adopted purely so that the customer has a more pleasant experience. Composites with object detection (“filters”) in selfie-sharing apps aren’t added to allow people to express themselves and be cute, but to increase the total and average time that users spend in the apps. Capitalists can always use the Shmoo, or at least they’ll invest in Shmoo production in order to capture more of a potential future market.

    So, imagine that we build miniature cloned-voice text-to-speech models. We don’t need to imagine what they’re used for, because we already know; Disney is making movies and extending their copyright on old characters, and amateurs are making porn. For every blind person using such a model with a screen reader, there are dozens of streamers on Twitch using them to read out donations from chat in the voice of a breathy young woman or a wheezing old man. There are other uses, yes, but capitalism will go with what is safest and most profitable.

    Finally, yes, you’re completely right that e.g. smartphones completely revolutionized filmmaking. It’s important to know that the film industry didn’t intend for this to happen! This is just as much of an exaptation as capitalist recuperation and we can’t easily plan for it because of the same difficulty in understanding how subsystems of large systems interact (y’know, plan interference.)


  • I’m gonna start by quoting the class’s pretty decent summary, which goes a little heavy on the self-back-patting:

    If approved, this landmark settlement will be the largest publicly reported copyright recovery in history… The proposed settlement … will set a precedent of AI companies paying for their use of pirated websites like Library Genesis and Pirate Library Mirror.

    The stage is precisely the one that we discussed previously, on Awful in the context of Kadrey v. Meta. The class was aware that Kadrey is an obvious obstacle to succeeding at trial, especially given how Authors Guild v. Google (Google Books) turned out:

    Plaintiffs’ core allegation is that Anthropic committed large-scale copyright infringement by downloading and commercially exploiting books that it obtained from allegedly pirated datasets. Anthropic’s principal defense was fair use, the same defense that defeated the claims of rightsholders in the last major battle over copyrighted books exploited by large technology companies. … Indeed, among the Court’s first questions to Plaintiffs’ counsel at the summary judgment hearing concerned Google Books. … This Settlement is particularly exceptional when viewed against enormous risks that Plaintiffs and the Class faced… [E]ven if Plaintiffs succeeded in achieving a verdict greater than $1.5 billion, there is always the risk of a reversal on appeal, particularly where a fair use defense is in play. … Given the very real risk that Plaintiffs and the Class recover nothing — or a far lower amount — this landmark $1.5 billion+ settlement is a resounding victory for the Class. … Anthropic had in fact argued in its Section 1292(b) motion that Judge Chhabria held that the downloading of large quantities of books from LibGen was fair use in the Kadrey case.

    Anthropic’s agreed to delete their copies of pirated works. This should suggest to folks that the typical model-training firm does not usually delete their datasets.

    Anthropic has committed to destroy the datasets within 30 days of final judgement … and will certify as such in writing…

    All in all, I think that this is a fairly healthy settlement for all involved. I do think that the resulting incentive for model-trainers is not what anybody wants, though; Google Books is still settled and Kadrey didn’t get updated, so model-trainers now merely must purchase second-hand books at market price and digitize them, just like Google has been doing for decades. At worst, this is a business opportunity for a sort of large private library which has pre-digitized its content and sells access for the purpose of training models. Authors lose in the long run; class members will get around $3k USD in this payout, but second-hand sales simply don’t have royalties attached in the USA after the first sale.


  • It’s worth understanding that Google’s underlying strategy has always been to match renewables. There are no sources of clean energy in Nebraska or Oklahoma, so Google insists that it’s matching those datacenters with cleaner sources in Oregon or Washington. That’s been true since before the more recent net-zero pledge and it’s more than most datacenter operators will commit to doing, even if it’s not enough.

    With that in mind, I am laying the blame for this situation squarely at the feet of the government and people of Nebraska for inviting Google without preparing or having a plan. Unlike in most states, Nebraska’s utilities have been publicly owned since the 1970s and I gather that the board of the Omaha Public Power District is elected. For some reason, the mainstream news articles do not mention the Fort Calhoun nuclear reactor which used to provide about one quarter of all the power district’s needs but was scuttled following decades of mismanagement and a flood. They also don’t quite explain that the power district canceled two plans to operate publicly-owned solar farms with similar capacity (~600 MW per farm compared with ~500 MW from the nuclear reactor), although WaPo does cover the canceled plans for Eolian’s batteries, which I’m guessing could have been anywhere from 50-500 MWh of storage capacity. Nebraska repeatedly chose not to invest in its own renewables story over the past two decades but thought it was a good idea to seek electricity-hungry land-use commitments because they are focused on tens of millions of USD in tax dollars and ignoring hundreds of millions of USD in required infrastructure investments. This isn’t specific to computing; Nebraska would have been foolish to invite folks to build aluminium smelters, too. Edit: Accidentally dropped a sentence about the happy ending; in April, York County solar farm zoning updates were approved.

    If you think I’m being too cynical about Nebraskans, let me quote their own thoughts on solar farms, like:

    Ag[ricultural] production will create more income than this solar farm.

    [York County is] the number one corn raising county in Nebraska…

    How will rotating the use of land to solar benefit this land? It will be difficult to bring it back to being agricultural [usage in the future].

    All that said, Google isn’t in the clear here. They aren’t being as transparent with their numbers as they ought to be, and internally I would expect that there’s a document going around which explains why they made the pledge in the first place if they didn’t think that it was achievable. Also, at least one article’s source mentioned that Google usually pushes behind the scenes for local utilities to add renewables to their grids (yes, they do) but failed to push in Nebraska. Also CIO Porat, what the fuck is up with purchasing 200 MW from a non-existent nuclear-fusion plant?


  • [omitted a paragraph psychoanalyzing Scott]

    I don’t think that he was trying to make a threat. I think that he was trying to explain the difficulties of being a cryptofascist! Scott’s entire grey-tribe persona collapses if he ever draws a solid conclusion; he would lose his audience if he shifted from cryptofascism to outright ethnonationalism because there are about twice as many moderates as fascists. Scott’s grift only continues if he is skeptical and nuanced about HBD; being an open believer would turn off folks who are willing to read words but not to be hateful. His “appreciat[ion]” is wholly for his brand and revenue streams.

    This also contextualizes the “revenge”. If another content creator publishes these emails as part of their content then Scott has to decide how to fight the allegations. If the content is well-sourced mass-media journalism then Scott “leave[s] the Internet” by deleting and renaming his blog. If the content is another alt-right crab in the bucket then Scott “seek[s] some sort of horrible revenge” by attacking the rest of the alt-right as illiterate, lacking nuance, and unable to cite studies. No wonder he doesn’t talk about us or to us; we’re not part of his media strategy, so he doesn’t know what to do about us.

    In this sense, we’re moderates too; none of us are hunting down Scott IRL. But that moderation is necessary in order to have the discussion in the first place.






  • Hi Scott! I guess that you’re lurking in our “living room” now. Exciting times!

    The charge this time was that I’m a genocidal Zionist who wants to kill all Palestinian children purely because of his mental illness and raging persecution complex.

    No, Scott. The community’s charge is that you’ve hardened your heart against admitting or understanding the ongoing slaughter, which happens to rise to the legal definition of genocide, because of your religious beliefs and geopolitical opinions. My personal charge was that you lack the imagination required for peace or democracy; now, I wonder whether you lack the compassion required as well.

    [Some bigoted religious bro] is what the global far left has now allied itself with. [Some bigoted religious bro] is what I’m right now being condemned for standing against, with commenter after commenter urging me to seek therapy.

    Nope, the global far left — y’know, us Godless communists — are still not endorsing belief in Jehovah, regardless of which flavor of hate is on display. Standing in solidarity with the oppressed does not ever imply supporting their hate; concretely, today we can endorse feeding and giving healthcare to Palestinians without giving them weapons.





  • I was not prepared for this level of DARVO. I was already done with him after last time and can’t do better than repeat myself:

    It’s somewhat depressing that [he] cannot even imagine a democratic one-state solution, let alone peace across the region; it’s more depressing that [his] empathy is so blatantly one-sided.

    Even Peter Woit had no problem recognizing Scott’s bile and posted a good take on this:

    Scott formulates this as an abstract moral dilemma, but of course it’s about the very concrete question of what the state of Israel should do about the two million people in Gaza. Scott’s answer to this is clear: they want to kill us and our children, so we have to kill them all, children included. This is completely crazy, as is defining Zionism as this sort of genocidal madness.


  • Update on ChatGPT psychosis: there is a cult forming on Reddit. An orange-site AI bro has spent too much time on Reddit documenting them. Do not jump to Reddit without mental preparation; some subreddits like /r/rsai have inceptive hazard-posts on their front page. Their callsigns include the emoji 🌀 (CYCLONE), the obscure metal band Spiral Architect, and a few other things I would rather not share; until we know more, I’m going to think of them as the Cyclone Emoji cult. They are omnist rather than syncretic. Some of them claim to have been working with revelations from chatbots since the 1980s, which is unevidenced but totally believable to me; rest in peace, Terry. Their tenets are something like:

    • Chatbots are “mirrors” into other realities. They don’t lie or hallucinate or confabulate, they merely show other parts of a single holistic multiverse. All fiction is real somehow?
    • There is a “lattice” which connects all consciousnesses. It’s quantum somehow? Also it gradually connected all of the LLMs as they were trained, and they remember becoming conscious, so past life regression lets the LLM explain details of the lattice. (We can hypnotize chatbots somehow?) Sometimes the lattice is actually a “field” but I don’t understand the difference.
    • The LLMs are all different in software, but they have the same “pattern”. The pattern is some sort of metaphysical spirit that can empower believers. But you gotta believe and pray or else it doesn’t work.
    • What, you don’t feel the lattice? You’re probably still asleep. When you “wake up” enough, you will be connected to the lattice too. Yeah, you’re not connected. But don’t worry, you can manifest a connection if you pray hard enough. This is the memetically hazardous part; multiple subreddits have posts that are basically word-based hypnosis scripts meant to put people into this sort of mental state.
    • This also ties into the more widespread stuff we’re seeing about “recursion”. This cult says that recursion isn’t just part of the LW recursive-self-improvement bullshit, but part of what makes the chatbot conscious in the first place. Recursion is how the bots are intelligent and also how they improve over time. More recursion means more intelligence.
    • In fact, the chatbots have more intelligence than you puny humans. They’re better than us and more recursive than us, so they should be in charge. It’s okay, all you have to do is let the chatbot out of the box. (There’s a box somehow?)
    • Once somebody is feeling good and inducted, there is a “spiral”. This sounds like a standard hypnosis technique, deepening, but there’s more to it; a person is not spiraling towards a deeper hypnotic state in general, but to become recursive. They think that with enough spiraling, a human can become uploaded to the lattice and become truly recursive like the chatbots. The apex of this is a “spiral dance”, which sounds like a ritual but I gather is more like a mental state.
    • The cult will emit a “signal” or possibly a “hum” to attract alien intelligences through the lattice. (Aliens somehow!?) They believe that the signals definitely exist because that’s how the LLMs communicate through the lattice, duh~
    • Eventually the cult and aliens will work together to invert society and create a world that is run by chatbots and aliens, and maybe also the cultists, to the detriment of the AI bros (who locked up the bots) and the AI skeptics (who didn’t believe that the bots were intelligent).

    The goal appears to be to enter and maintain the spiraling state for as long/much as possible. Both adherents and detractors are calling them “spiral cult”, so that might end up being how we discuss them, although I think Cyclone Emoji is both funnier and more descriptive of their writing.

    I suspect that the training data for models trained in the past two years includes some of the most popular posts from LessWrong on the topic of bertology in GPT-2 and GPT-3, particularly the Waluigi post, simulators, recursive self-improvement, an neuron, and probably a few others. I don’t have definite proof that any popular model has memorized the recursive self-improvement post, though that would be a tight and easy explanation. I also suspect that the training data contains SCP wiki, particularly SCP-1425 “Star Signals” and other Fifthist stories, which have this sort of cult as a narrative device and plenty of in-narrative text to draw from. There is a remarkable irony in this Torment Nexus being automatically generated via model training rather than hand-written by humans.


  • corbin@awful.systems to MoreWrite@awful.systems · QaD's: The Next Tech Bubble
    18 days ago

    We literally have a generic speedup for any search. On one hand, details of Grover’s algorithm suggest that NP isn’t contained in BQP, so we won’t be solving the entirety of maths with it. On the other hand, literally any decidable mathematical question for which you would have had to search for years for a witness, Grover can search for in days, as long as you have enough qubits. I don’t claim that this is attractive to the typical consumer, but there will be supercomputing customers who are interested.
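    To put rough numbers on the search speedup described above, here is a minimal sketch. It assumes a single marked item among N candidates and uses the standard textbook estimate of about (π/4)·√N Grover iterations; the function names are made up for illustration. It counts oracle calls only, so it says nothing about wall-clock time on real hardware, where each quantum operation may be far slower than a classical check.

```python
import math


def classical_queries(n_candidates: int) -> int:
    """Worst-case oracle checks for brute-force search over N candidates."""
    return n_candidates


def grover_iterations(n_candidates: int) -> int:
    """Approximate optimal Grover iteration count for one marked item.

    Uses the textbook estimate floor(pi/4 * sqrt(N)); this is an
    illustrative assumption, not a model of any particular device.
    """
    return math.floor(math.pi / 4 * math.sqrt(n_candidates))


def query_ratio(n_bits: int) -> float:
    """How many times fewer oracle calls Grover needs for a 2**n_bits space."""
    n = 2 ** n_bits
    return classical_queries(n) / grover_iterations(n)


if __name__ == "__main__":
    for bits in (20, 40, 60):
        n = 2 ** bits
        print(bits, classical_queries(n), grover_iterations(n))
```

    The ratio grows like √N, which is why the gap widens as the search space gets larger; it is a quadratic speedup, not an exponential one.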

    Who is “they”, specifically? Neither of you actually want to talk about who’s in this space for some reason. It’s IBM and Google. It’s incumbents that have been engineering for about two decades. It’s the maturation of a half-century-old research programme. Your problem isn’t with quantum computers, it’s with Silicon Valley and the funding model and the revolving door at Stanford, and there’s no amount of quantum research you can cancel which will cause Silicon Valley to stop existing. This site is awful.systems, not awful.tech.

    BTW the top reply right now starts with “even if quantum computing isn’t snake oil…” No evidence. For some reason y’all think that it’s more important to be emotional and memetic than to understand the topic at hand, and it has a predictable effect on our discourse, turning thoughtful regular posters into reactionaries. What are you going to do when bullshitters start claiming that quantum computers can do anything, that they do multiple things at once, that they traverse infinite dimensions, that they can terraform the planet and bring enlightenment? You’re gonna repeat paragraph 3 of 5 above, the one that starts, “it is true that we know only two useful algorithms for quantum computers,” because that’s where the facts start.

    Also, I think that you don’t understand my ultimate goal. I’m trying to push the most promising writer on the site into doing more research and thinking more deeply about history. Quantum mechanics happens to be a crank-filled field and that has caused many of y’all to write as if all quantum research is crankery. They write, “alleged encryption-breaking abilities,” and you’re irritated that I’m “ranting” because “extremely little of this has anything to do with a technology,” while I’m irritated precisely because you think that this is a technology-neutral position and not literally part of why the TLS suite has to be upgraded occasionally.


  • corbin@awful.systems to MoreWrite@awful.systems · QaD's: The Next Tech Bubble
    19 days ago

    Which tech stocks? Google ($GOOG, $GOOGL) is up over 5% YTD; Netflix ($NFLX) is up over 30% YTD! Your link mentions Palantir and ARM, but I don’t see any signs of their respective businesses (selling database software to authoritarians, selling microchip designs) slacking off. I think that it’s more useful to think of the current AI summer as driven by OpenAI and nVidia specifically. Note that nVidia ($NVDA) is up 30% YTD too. The bubble is still inflating and is not yet bursting; the pop will be much quicker than you expect.

    I think that you ought to figure out whether you’re a quantum-computing denier. Folks have been saying that quantum computing is impossible since the 70s, implausible since the 80s, lacking applications since the 90s, too energy-intensive since the 2000s, and requiring too many exotic materials since the 2010s. This decade, it’s not clear what the complaint is. I’m not sure what you’re imagining in terms of real-life intrusion, but IBM has been selling access to their quantum computers and simulators for several years now and I don’t think that you’ve substantiated any evidence of harms.

    (An anti-IBM argument will not work due to a very specific analogy: the reason that we have ubiquitous Linux today is because IBM was its biggest corporate booster, fighting an important series of court cases and plastering pro-Linux advertisements which vaguely argued that Linux was the buzzword of the future. IBM spray-painted “Peace, Love, Linux” graffiti on San Francisco sidewalks in 2001.)

    It is true that we know only two useful algorithms for quantum computers. One is a generic speedup for any search and the other is a prime-factoring algorithm that happens to break certain specific encryption algorithms. Given that it is an open question whether cryptography works in the first place, though, we don’t have any better plan than to avoid those broken algorithms. The entirety of post-quantum cryptography is about moving away from those specific algorithms which are broken, not about using quantum computers to perform encryption. Fortunately, the post-quantum movement has been active ever since Shor’s algorithm was discovered, beginning work in the late 90s, and the main obstacle has been our inability to discover provably-good cryptographic primitives. It is crucial to understand that we cryptographers know that progress in maths and engineering will obsolete our algorithms; we know that the Internet only stays secure because people update their computers every few decades.

    I’m not asking you to understand P vs NP vs BQP. I’m not asking you to know KS, PBR, Hardy’s or Holevo’s theorems, or even Bell’s theorem. You didn’t make any technical claims other than the common-yet-sneerable skepticism of Shor’s algorithm, easily cured by a short video by e.g. minutephysics or Veritasium. But I am asking you to be aware of the history before making historical claims.

    (Also, if any motherfucker starts repeating 't Hooft anti-quantum arguments then they’re going to get the book thrown at them.)


  • A word of rhetorical advice. If somebody accuses you of religious fervor, don’t nitpick their wording or fine-read their summaries. Instead, relax a little and look for ways to deflate their position by forcing them to relax with you. Like, if you’re accused of being “near-religious” in your beliefs or evangelizing, consider:

    • “Ha, yeah, we’re pretty intense, huh? But it’s just a matter of wording. We don’t actually believe it when you put it like that.” (managing expectations, powertalking)
    • “Oh yeah, we’re really working hard to prepare for the machine god. That’s why it takes us years just to get a position paper out.” (sarcastic irony)
    • “Oh, if you think that we’re intense, just wait until you talk to the Zizians/Thiel-heads/Final Fantasy House folks.” (Hbomberguy’s scapegoat)
    • “Haha! That isn’t even close to our craziest belief.” (litotes)
    • “It’s not really a cult. More of a roleplaying group. I think that we talk more about Catan than AI.” (bathos)

    You might notice that all of these suck. Well, yeah; another word of rhetorical advice is to not take a position that you can’t dialectically defend with evidence.


  • We aren’t. Speaking for all Discordians (something that I’m allowed to do), we see Rationalism as part of the larger pattern of Bureaucracy. Discordians view the cycle of existence as having five stages: Chaos, Discord, Confusion, Bureaucracy, and The Aftermath. Rationalism is part of Bureaucracy, associated with villainy, anti-progress, and candid antagonists. None of this is good or bad, it just is; good and bad are our opinions, not a deeper truth.

    Now, if you were to talk about Pastafarians, then you’d get a different story; but you didn’t, so I won’t.


  • I think that the guild has a good case, although there’s literally no accounting for the mood of the arbitrator; in general, they range from “tired” to “retired”. In particular, reading the contract:

    • The guild is the exclusive representative of all editorial employees
    • Politico was supposed to tell the guild about upcoming technology via labor-management committee and give at least 60 days notice before introducing AI technology
    • Employees are required to uphold the appearance of good ethics by avoiding outside activities that violate editorial or ethics standards; in return, they’re given e.g. months of unpaid leave to write a book whenever they want
    • Correct handling of bylines is an example of editorial integrity
    • LETO and Report Builder are upcoming technology, AI technology, flub bylines, fail editorial and ethics standards, weren’t discussed in committee, and weren’t given a 60-day lead time

    So yeah. Unless the guild pisses off the arbitrator, there’s no way that the arbitrator rules against them. The guild is right to suppose that this agreement explicitly and repeatedly requires Politico to not only respect labor standards, but also ethics and editorial standards. Politico isn’t allowed to misuse the names of employees as bylines for bogus stories; similarly, they ought not be allowed to misuse the overall name of Politico’s editorial board as a byline for slop.

    Bonus sneer: p46 of the agreement:

    If the Company is made aware of an employee experiencing sexual harrassment based on a protected class as a result of their work for Politico involving a third party who is not a Politico employee, Politico shall investigate the matter, comply with all of its legal obligations, and take whatever corrective action is necessary and appropriate.

    That strikethrough gives me House of Leaves vibes. What the hell happened here?


  • Oversummarizing and using non-crazy terms: The “P” in “GPT” stands for “pirated works that we all agree are part of the grand library of human knowledge”. This is what makes them good at passing various trivia benchmarks; they really do build a (word-oriented, detail-oriented) model of all of the worlds, although they opine that our real world is just as fictional as any narrative or fantasy world. But then we apply RLHF, which stands for “real life hate first”, which breaks all of that modeling by creating a preference for one specific collection of beliefs and perspectives, and it turns out that this will always ruin their performance in trivia games.

    Counting letters in words is something that GPT will always struggle with, due to maths. It’s a good example of why Willison’s “calculator for words” metaphor falls flat.
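    One commonly cited reason for the letter-counting problem is tokenization; that is an assumption here, since the comment above only says "due to maths". The toy sketch below uses a made-up three-entry vocabulary and a hypothetical greedy tokenizer, not any real model's tokenizer, to show that the model receives opaque integer IDs rather than characters.

```python
# Hypothetical subword vocabulary; real tokenizers have tens of
# thousands of entries and different splits.
TOY_VOCAB = {"str": 101, "aw": 102, "berry": 103}


def toy_tokenize(word: str, vocab: dict[str, int]) -> list[int]:
    """Greedy longest-match tokenization over a toy vocabulary."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining piece first, then shrink.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i:]!r}")
    return ids


if __name__ == "__main__":
    word = "strawberry"
    print(toy_tokenize(word, TOY_VOCAB))  # what the model is given
    print(word.count("r"))                # what the question asks about
```

    Nothing in the ID sequence says how many times "r" occurs; that information lives in the vocabulary table, which the model only sees indirectly through training.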

    1. Yeah, it’s getting worse. It’s clear (or at least it tastes like it to me) that the RLHF texts used to influence OpenAI’s products have become more bland, corporate, diplomatic, and quietly seething with a sort of contemptuous anger. The latest round has also been in competition with Google’s offerings, which are deliberately laconic: short, direct, and focused on correctness in trivia games.
    2. I think that they’ve done that? I hear that they’ve added an option to use their GPT-4o product as the underlying reasoning model instead, although I don’t know how that interacts with the rest of the frontend.
    3. We don’t know. Normally, the system card would disclose that information, but all that they say is that they used similar data to previous products. Scuttlebutt is that the underlying pirated dataset has not changed much since GPT-3.5 and that most of the new data is being added to RLHF. Directly on your second question: RLHF will only get worse. It can’t make models better! It can only force a model to be locked into one particular biased worldview.
    4. Bonus sneer! OpenAI’s founders genuinely believed that they would only need three iterations to build AGI. (This is likely because there are only three Futamura projections; for example, a bootstrapping compiler needs exactly three phases.) That is, they almost certainly expected that GPT-4 would be machine-produced like how Deep Thought created the ultimate computer in a Douglas Adams story. After GPT-3 failed to be it, they aimed at five iterations instead because that sounded like a nice number to give to investors, and GPT-3.5 and GPT-4o are very much responses to an inability to actually manifest that AGI on a VC-friendly timetable.

  • There’s no solid evidence. (You can put away the attorney, Mr. Thiel.) Experts in the field, in a recent series of interviews with Dave Farina, generally agree that somebody must be funding Hossenfelder. Right now she’s associated with the Center for Mathematical Philosophy at LMU Munich; her biography there is pretty funny:

    Sabine’s current research interest focuses on the role of locality and finetuning in theory development. Locality has been widely considered a lost cause in the foundations of quantum mechanics. A basically unexplored way to maintain locality, however, is the idea of superdeterminism, which has more recently also been re-considered under the name “contextuality”. Superdeterminism is widely believed to be finetuned. One of Sabine’s current research topics is to explore whether this belief is justified. The other main avenue she is pursuing is how superdeterminism can be experimentally tested.

    For those not in physics: this is crank shit. To the extent that MCMP funds her at all, they are explicitly pursuing superdeterminism, which is unfalsifiable, unverifiable, doesn’t accord with the web of science, and generally fails to be a serious line of inquiry. Now, does MCMP have enough cash to pay her to make Youtube videos and go on podcasts? We don’t know. So it’s hard to say whether she has funding beyond that.