Forum: agsXMPP RSS
zackrspv #1
Subject: [SOLVED] msg.Html
Alex,

You said in previous posts that you use WebBrowser controls for HTML input from users. How did you get around the non-compliant XHTML they produce?

Do you use AgilityPack, Tidy, or SGML to parse the HTML into XHTML, so that it can be added properly to the msg.Html child element?

The reason I ask is that, right now, the HTML we pass to the server is sent in an encoded form (not even human-readable) and is only reparsed and recombined on the receiving end (the client). We had to do this because, at the time, users were creating tons of different HTML elements. Now it's all more standard, and we'd like to use the XHTML handling in XMPP so that we do not have to re-encode the HTML ourselves.

However, the HTML that the WebBrowser control generates isn't valid XHTML. For example, webBrowser1.Document.Body.InnerHtml doesn't output proper XHTML: attribute values such as colors are written without quotes, as raw hex (color=#332211) instead of (color="#332211").

So how did you get around this issue?

--Phillip
This post was edited on 2011-06-27, 23:55 by zackrspv.
Alex #2
Hi Phillip,

I use only very simple HTML. In my projects I only allow the user to write a complete message in a single font style, color and size. I don't allow bold in between, tables, lists or other stuff, only hyperlinks. I fetch the plain text from the DOM and create the XHTML on my own.

The XHTML-IM XEP also defines only a subset of XHTML:
http://xmpp.org/extensions/xep-0071.html

But as long as your XHTML is valid you can embed it in a message and every server will route it. You should at least have a whitelist of tags, though, to prevent users from sending scripts or other malicious content.
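
A whitelist check along those lines could look roughly like this. It is only a sketch: the class name XhtmlWhitelist and the tag and attribute lists are hypothetical examples, not something taken from agsXMPP or copied from the XEP, so adjust them to whatever you actually want to allow. It uses only System.Xml.Linq from the .NET base class library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XhtmlWhitelist
{
    // Illustrative lists only (an assumption); pick your own based on XEP-0071.
    private static readonly HashSet<string> AllowedTags =
        new HashSet<string>(StringComparer.Ordinal)
        { "body", "p", "span", "br", "a", "strong", "em" };

    private static readonly HashSet<string> AllowedAttributes =
        new HashSet<string>(StringComparer.Ordinal)
        { "style", "href" };

    // Returns true when every element and attribute is on the whitelist.
    // XElement.Parse throws XmlException if the input is not well-formed XML.
    public static bool ContainsOnlyAllowedMarkup(string xhtml)
    {
        XElement root = XElement.Parse(xhtml);
        return root.DescendantsAndSelf().All(e =>
            AllowedTags.Contains(e.Name.LocalName) &&
            e.Attributes().All(a =>
                a.IsNamespaceDeclaration ||
                AllowedAttributes.Contains(a.Name.LocalName)));
    }
}
```

Note that this only looks at element and attribute names. It does not inspect attribute values, so something like a javascript: URL inside an href would still need a separate check.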

Alex
zackrspv #3
Solved by doing the following:

Custom Class:

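using System.IO;
using System.Text;

public static class MakeXhtml
{
    // Converts browser-generated HTML into well-formed XML via HtmlAgilityPack.
    public static string ConvertToXhtml(string source)
    {
        var sb = new StringBuilder();
        var doc = new HtmlAgilityPack.HtmlDocument();

        // Set the options before loading so that the parsing-related ones
        // (syntax check, nested-tag fixing, auto-close) apply to the parse.
        doc.OptionOutputAsXml = true;
        doc.OptionCheckSyntax = true;
        doc.OptionFixNestedTags = true;
        doc.OptionOutputOptimizeAttributeValues = true;
        doc.OptionAutoCloseOnEnd = true;
        doc.OptionWriteEmptyNodes = true;
        doc.LoadHtml(source);

        // Write the repaired document into the StringBuilder.
        using (var stringWriter = new StringWriter(sb))
        {
            doc.Save(stringWriter);
        }
        return sb.ToString();
    }
}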

Call it like this (XmppSend is my own send method):
var xhtml = MakeXhtml.ConvertToXhtml(webBrowser1.DocumentText);
XmppSend(xhtml, webBrowser1.Document.Body.InnerText);

My system now pulls proper XHTML out and sends it without issue. Nice and easy. I did make some changes to the client-side handling of font colors, etc., to ensure that font size isn't affected. Other than that, it should handle most HTML entered into the input WebBrowser and output it correctly. And if not, since you are sending the plaintext anyway, just check for invalid XHTML or missing elements and replace it with the plaintext version.
Alex #4
Quote by zackrspv:
And if not, since you are sending the plaintext anyway, just check for invalid XHTML or missing elements and replace it with the plaintext version.
Invalid XHTML will result in invalid XML in most cases. When you send invalid XML over the wire, the XMPP server will close your session with an error message. This means you should check the result before you send it.
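
One way to do such a check is to run the converted string through an XML parser first and fall back to the plain-text body if it fails. The following is a minimal sketch under that assumption; the class and method names (XhtmlCheck, IsWellFormedXml, XhtmlOrNull) are made up for illustration and are not part of agsXMPP. It only tests well-formedness, not whether the markup conforms to XHTML-IM.

```csharp
using System.IO;
using System.Xml;

public static class XhtmlCheck
{
    // Returns true when the string parses as well-formed XML.
    public static bool IsWellFormedXml(string candidate)
    {
        if (string.IsNullOrWhiteSpace(candidate))
        {
            return false;
        }

        // DTDs are prohibited so that a DOCTYPE is rejected instead of resolved.
        var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Prohibit };
        try
        {
            using (var reader = XmlReader.Create(new StringReader(candidate), settings))
            {
                // Reading to the end forces the parser to validate the whole input.
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    // Returns the XHTML when it is well-formed, otherwise null so the
    // caller can send the plain-text body on its own.
    public static string XhtmlOrNull(string candidate)
    {
        return IsWellFormedXml(candidate) ? candidate : null;
    }
}
```

With something like this in place, the sending code can skip the HTML part whenever XhtmlOrNull returns null and still deliver the plain text.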

Alex
zackrspv #5
Alex:

I have not seen the disconnect. If, for example, I pass invalid XML to the msg.Html element and try to send the message, the <message> stanza is still sent, but the Html child element is just blank: <html/>.

--Phillip
Alex #6
When you build the packets with the agsXMPP DOM classes, agsXMPP takes care of producing valid XML.

Alex
zackrspv #7
That's why I'm not too worried about checking for valid XHTML like that. Since I'm using agsXMPP to handle it, if I pass invalid HTML to the Html child element, the agsXMPP DOM strips it out (almost completely).

--Phillip