Numeric Character References don't count against character limits

Mar 27, 2010 22:29


Title
Numeric Character References don't count against character limits

Short, concise description of the idea
Typographic markup wouldn't count against character limits in entries or comments.

Full description of the idea
I like to format my entries to use proper typographical symbols -- for example, smart (curly) quotes vs. dumb (straight) quotes, ...

character count, comments, data limitations, § no status, comment creation


Comments 39

azurelunatic May 17 2010, 04:43:32 UTC
I know that the entry length limit is in bytes, rather than characters, so while it might be possible to replace some entities with the smallest possible equivalent thing, some characters may still take up more than one byte.

But I'm all for being able to condense this into as few as possible bytes.



lady_angelina May 17 2010, 04:49:35 UTC
I'm not... sure I understand what this suggestion's about. ^^;;


oldandnewfirm May 17 2010, 04:59:17 UTC
Ha ha. Yes, that's what I meant about the feature not necessarily having a huge appeal. It's really only nerdy designer types who'd notice typographical things that I'd mentioned above. But if you've ever worked with any word processing software before, the correct character sets are ones you've seen all the time but have probably never put much thought into.

Basically, I'd like LJ to automatically turn characters like " " (straight quotes, used for measurement) into “ ” (curly quotes, used for writing) and to turn -- into a true em-dash. Things like that.


lady_angelina May 17 2010, 05:05:16 UTC
Ahh, thanks! (And see... I use the straight quotes and double-dashes all the time, and actually have the other symbols disabled in Word, so... XD )

If the appeal is limited to a minority set on LJ, then I would think that it might be best compensated for through specialized tags, like the "&lt;" tag for the "less-than" < symbol, etc.

But eh, what do I know? XD

EDIT: And, derp! I now realize the point of this suggestion: to cut down on the space required by such tags so that it doesn't break character limit. I get that now. I run into that all the time when editing my userpics comments, especially when crediting someone on LJ.

EDIT v2.0: So... how about if instead of auto-converting these, the HTML cleaner instead just recognizes such tags as only one character as opposed to the four (or more) characters required to construct them?
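
To make the "count each tag as one character" idea concrete, here is a minimal sketch in Python. It is not how LJ's HTML cleaner works; the pattern `CHAR_REF` and the function `display_length` are hypothetical names made up for illustration, and the pattern only covers well-formed references that end in a semicolon.

```python
import re

# Hypothetical pattern: matches named (&ldquo;), decimal (&#8220;)
# and hexadecimal (&#x201C;) character references ending in ";".
CHAR_REF = re.compile(r"&(?:#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);")


def display_length(text: str) -> int:
    """Count characters, treating each character reference as one character."""
    # Swap every reference for a single placeholder, then measure the result.
    return len(CHAR_REF.sub("x", text))


sample = "&ldquo;Hi&rdquo; &#8212; ok"
raw_length = len(sample)                 # every typed character counts
counted_length = display_length(sample)  # each reference counts once
print(raw_length, counted_length)
```

A limit checked against `display_length` instead of the raw length would let the markup through without auto-converting anything in the stored text.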


azurelunatic May 17 2010, 05:10:16 UTC
Oh, I thought you meant to turn things like &ldquo; into “ when it spots them in your entry.

Because of how epically smart-quotes mess with HTML, I'm against that happening completely automatically; I'm one of the sorts who turns the auto-replace feature off directly after installing WordPerfect, because I manage to use quotes in perfectly reasonable ways that always manage to have a large number of the smart-quotes pointed the wrong way.

A button or something would be nice for those who want to use it, but having smart-quotes happen automatically would be an epic dealbreaker for me.



daluci May 17 2010, 05:06:07 UTC
Heh, the problem is that I know several people who like it the other way. They use -- for one thing and - for another. (Also, I don't much like the smart quotes. ): )

As a possibility - if you preview your comment and paste the preview in, I think it doesn't count as much against your character limit? I'm not... positive on that, though.


azurelunatic May 17 2010, 05:13:17 UTC
An HTML entity written out in its multi-character form is going to take up as many bytes as it has characters in it -- &ldquo; is a 7-byte code, because it uses seven one-byte characters. I suspect that its equivalent, “, might just be two bytes, though I'm not sure. But that's how it would work.
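
The byte counts are easy to check. A quick sketch in Python, assuming the text is stored as UTF-8 (the helper `utf8_len` is just an illustrative name): the entity does come out at seven bytes, while the literal curly quote comes out at three bytes rather than two, which is still less than half the size.

```python
def utf8_len(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8 (assumed encoding)."""
    return len(text.encode("utf-8"))


entity = "&ldquo;"      # seven ASCII characters, one byte each
literal = "\u201c"      # the left curly quote itself
em_dash = "\u2014"      # the em-dash character itself

entity_bytes = utf8_len(entity)
literal_bytes = utf8_len(literal)
em_dash_bytes = utf8_len(em_dash)
print(entity_bytes, literal_bytes, em_dash_bytes)
```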


daluci May 17 2010, 05:19:48 UTC
From testing on my journal, you can post a great deal more em-dashes in a comment if you paste the character itself than if you use the &mdash; entity, so it should somewhat work. That's just in a comment, of course. I spend much more time breaking the character-limit on comments than on entries, so I don't know all that much about how the limit on entries works.


azurelunatic May 17 2010, 05:21:55 UTC
The limit on entries is in bytes (64k worth), so it would work the same way.
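
A rough sketch of what that means in practice, in Python. It assumes "64k" means 64 * 1024 bytes and that the text is stored as UTF-8; neither is confirmed here, and `LIMIT_BYTES`, `byte_size` and `fits` are hypothetical names, not anything from LJ's code.

```python
import html

LIMIT_BYTES = 64 * 1024  # assumption: "64k" taken as 65,536 bytes


def byte_size(text: str) -> int:
    """Size of the text in bytes, assuming UTF-8 storage."""
    return len(text.encode("utf-8"))


def fits(text: str, limit: int = LIMIT_BYTES) -> bool:
    """True if the text stays within the assumed byte limit."""
    return byte_size(text) <= limit


# Ten thousand em-dashes written as entities: 7 bytes apiece.
entry = "&mdash;" * 10000
# The same text with the entities decoded to literal characters: 3 bytes apiece.
decoded = html.unescape(entry)

print(byte_size(entry), fits(entry))
print(byte_size(decoded), fits(decoded))
```

One caution: `html.unescape` also decodes things like `&lt;` and `&amp;`, which would change how the markup is read, so a real tool would have to leave those alone.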



(The comment has been removed)

perlmonger May 17 2010, 12:57:07 UTC
This.

Depending on your OS, you may have a compose character or 3rd level shift function already to type non-keyboard characters in directly, and if not, software addons to do it are available and free (or at least dirt cheap). I've not investigated, but if you're lucky, multi-byte UTF-8 characters might even only count as one in LJ - Twitter, for example, is helpful in that way.


perlmonger May 17 2010, 12:58:51 UTC
Also, MySQL CHAR and VARCHAR sizes are in characters, not bytes.



scien May 17 2010, 07:15:22 UTC
I just wanted to say nooo to an auto replace equivalent.


