Is there a way to do unsafe type coercions?

thautwarm · September 2, 2019, 10:00am

Let the compiler treat an object as a custom specified type. It’s unsafe but it’s necessary for this:

thautwarm · September 2, 2019, 10:49am

Got. unsafe_convert.

yuyichao · September 2, 2019, 10:58am

Err, no. It’s not possible and that function has nothing to do with it. The closest you’ll get is reinterpret.
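
As a minimal sketch of what the built-in reinterpret does (the values below are illustrative and not taken from this thread): it relabels the bits of a primitive value as another primitive type of the same size, and no pointers are involved.

```julia
# Reinterpret the 64 bits of a Float64 as a UInt64 and back again.
x = 1.0
bits = reinterpret(UInt64, x)     # same bits, different type
y = reinterpret(Float64, bits)    # the round trip restores the value

println(typeof(bits), " ", typeof(y))
```

Base.unsafe_convert, by contrast, is the hook that ccall uses to turn an argument into the raw value passed to C, which is why it does not help with re-typing an object.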

thautwarm · September 2, 2019, 10:59am

reinterpret cast doesn’t do this for me. I’ll show you the code. In fact I just cast instances into

struct App{TypeCons, TypeArg}
   buf :: NTuple{N, UInt8} where N
end

yuyichao · September 2, 2019, 11:07am

If you were talking about pretending an object is something else then no there’s no way to do it. If you can wrap it then basically any type wrapper will do with whatever tag you want to put in the type parameters.
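
A hedged sketch of such a wrapper, assuming hypothetical names Tagged, untag and kind (none of them come from this thread): the tag lives only in the type parameters and the object itself is stored unchanged.

```julia
# Hypothetical wrapper: `Tag` is any type or symbol, used only for dispatch.
struct Tagged{Tag, T}
    value::T
end

# Convenience constructor: choose the tag, infer T from the argument.
Tagged{Tag}(x::T) where {Tag, T} = Tagged{Tag, T}(x)

# Recover the wrapped object.
untag(t::Tagged) = t.value

# Dispatch can depend on the tag alone.
kind(::Tagged{:meters}) = "length"
kind(::Tagged{:seconds}) = "time"

d = Tagged{:meters}(1.5)
println(kind(d), " ", untag(d))
```

Because T is a type parameter, the field stays concretely typed, so wrapping does not by itself force ::Any storage.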

I’m not sure why you need a tuple there. Is it to hold data for the object? If that’s the case then you should not do that and should just put the actual object there.

thautwarm · September 2, 2019, 11:22am

You’re right. Now I use reinterpret to do an arbitrary unsafe coercion. It works, but I doubt it will work well with the GC.

I’m not sure why you need a tuple there. Is it to hold data for the object? If that’s the case then you should not do that and should just put the actual object there.

In fact I’ll try storing data as Any as well, and make some comparisons of the performance.
In terms of my use case, I don’t need a real Any, and what I exactly want is allowing 2 identical type representations for some objects.

yuyichao · September 2, 2019, 11:38am

As long as you are not playing with pointers, anything is fine with the GC. If you are not using the builtin reinterpret but are making your own with pointers, then no, you can’t do it.

The point is what do you need for the identity of the original object.

If you need full object identity (by julia semantics) then no, that’s impossible.

If you want pointer identity, first you should ask why you need that since unless you are interacting with C that generally doesn’t matter. This is also generally impossible.

If you want the data to be the same, this is reinterpret, and it is only possible with bitstype. You should also decide why you need the data to be the same. How are you going to use the data? What operations are you going to do on the reinterpreted (re-tagged) data? If, after all the fancy dispatch, you want to treat the data as the original type, then what you want is to restore the original object, and what you want is not data identity. If you want to operate on the data without retrieving the original object, then what you have is pretty much just tagged custom data. You don’t have an “original object” anymore, only “original data”. You can obviously store the data however you like, since that’s all you have anyway.
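
To make the bits-only restriction concrete, here is a small sketch with a hypothetical Point type (an assumption for illustration): isbitstype reports whether a type is plain immutable data, and only such data can be viewed as raw bytes and restored with reinterpret.

```julia
# Hypothetical example type: two Float32 fields, 8 bytes, no references.
struct Point
    x::Float32
    y::Float32
end

println(isbitstype(Point))         # plain immutable data
println(isbitstype(Vector{Int}))   # heap-allocated, not a bits type

# View an array of Point as raw bytes, then rebuild Points from the bytes.
pts = [Point(1.0f0, 2.0f0)]
bytes = collect(reinterpret(UInt8, pts))   # the bytes carry no type information
back = reinterpret(Point, bytes)
```

The bytes alone do not say which type they came from, so the original type has to be tracked elsewhere, for example in a type parameter.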

If you only need to be able to retrieve the original object, then by all means just store it. You have the classic decision to make about whether to specialize that storage (::Any vs ::T) for performance/dispatch. This is the only way you can do it if you want to support arbitrary types.

yuyichao · September 2, 2019, 12:06pm

I don’t know what you want to say here.

I obviously said things so what you said is clearly wrong. (Unless you meant literally that I was typing and not speaking)

I expect everything I said to be logically valid and existing information, so in that sense I didn’t say anything new. However, you are asking about doing something in an existing framework, so I also don’t expect anything new, and discussions generally start with existing knowledge anyway.

Maybe you meant that you know everything I said by heart already. That’s great. You know everything you want. Since you still want ntuple and also accept all my logic I’ll just assume what you want is tagged custom data rather than retagging existing object then…

thautwarm · September 2, 2019, 12:36pm

Since you still want ntuple and also accept all my logic I’ll just assume what you want is tagged custom data rather than retagging existing object then

You missed the process.

After I realized unsafe_convert is infeasible for my goal, I used reinterpret instead. I was literally wrong in

Got. unsafe_convert.

What I want is the identity of actual data representations, but when it comes to compiler-aware type representations, a piece of data can have 2 types.

This operation is called %identity in OCaml (see “ocaml %identity function” on Stack Overflow), and can be achieved in Haskell with Unsafe.Coerce.unsafeCoerce.

yuyichao · September 2, 2019, 2:43pm

So what I said above about

Is answering directly about this, which isn’t nothing. Also, since you are explicitly asking about this I hope you are actually looking for an answer. If this answer somehow seems like nothing and doesn’t satisfy you please explain which part is unclear…

Well, yeah, either with NTuple or reinterpret. Both only work for bitstype and both lose the original type information. If that’s what you want, that’s totally fine. Still, that’s covered in

which tells you under which conditions you’d select this approach and what the limitations are. I don’t believe it’s nothing.

I have no idea what those are, but I’m pretty sure they don’t literally exist in julia. That’s why I posted different interpretations and requirements for you to pick which one actually matches what you need, which isn’t nothing either.

What you are apparently “using” changed multiple times so I was just listing all possibilities that I could think of. Just to be clear, I didn’t say (apart from the two impossible ones) which one you should take when I listed the possibilities that you said is nothing. I hope that’s already clear since

Matches what you later said

so using that is totally fine (again, as long as you stick with bitstype). If this is nothing new to you that’s good, though I don’t think I’ve ever seen it expressed this way.

Also, just to be clear about the title and the original post, unless you are playing with pointer/C code, there’s nothing unsafe. Unsafe things are ones that will only work if you are careful and can crash if you are not. In this case though, there are only things that are impossible to do/impossible to work or things that should be working and safe. I assume this isn’t news to you but I just want to point this out since the distinction between the two concepts is not always clear in some discussions.

thautwarm · September 2, 2019, 4:20pm

Yes, you’re right. I was angry at that time because I thought you were leading the post off-topic, and I used some improper words. That’s my fault.

However, if I did want to argue with you, I could say I was already using reinterpret and what I expected was for it to work well with the GC (actually this is what I expected from your answer). At that time I pondered a lot, e.g. if I make a pointer by p = Ptr{T}(); unsafe_store!(p, data, 1), how should I keep its lifetime exactly the same as that of the original data? Several ways turned out to be infeasible (e.g., by finalizer, which will cause UB when the data is immutable).
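
For the pointer route described here, the documented tool for lifetime control is GC.@preserve, which keeps an object rooted while a raw pointer derived from it is in use. The following is only a sketch with illustrative names (r, bits). It relies on a mutable Ref box and does not solve the problem for immutable values, which may not have a stable address at all.

```julia
# Illustration only: read the bits of a Float64 through a raw pointer.
r = Ref(1.0)   # mutable box holding the value

bits = GC.@preserve r begin
    # `r` is guaranteed to stay alive for the duration of this block.
    p = Base.unsafe_convert(Ptr{Float64}, r)
    unsafe_load(convert(Ptr{UInt64}, p))
end

println(bits == reinterpret(UInt64, 1.0))
```

The result matches what the builtin reinterpret returns, but the pointer version is only valid inside the preserved block.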

I know this, and I just felt I got teased, since you’re teaching me here what identity and structural equality are.

As I said, “allowing 2 identical type representations for some objects”, and “identical” here means “getting treated in the same way in terms of the type representations”. But then you talked a lot about “data identity”, “object identity”, and even custom tagged data. I thought you were challenging my purpose and approach (even though I had posted the paper at the top of the post), felt my words were neglected, and felt this post was getting off topic.

At that time I was trying to store all objects into valid pointers (however, that then turned out to be infeasible as well) and reinterpret them into a pointer of an equally sized Julia type; that’s why I use NTuple. I’m afraid there’s a big performance loss if I store the object in the form of Any.

Additionally, the reason why I think NTuple is faster than Any is that the operations on my data/objects will only get performed when they’re of the original types. For any type I want to do the unsafe coercions on, there is a pair of coercing methods (inject, from the original type to the wrapped one, which is unique for each original type, to support some advanced polymorphisms; project, from the unique wrapped type back to the original). My expectation is that, if everything is well typed, the only performance cost will be the reinterpret cast and storing an 8-byte pointer, which is really fast.

I’d like to tell you, I still think it’s “data identity” and “object identity”. For immutable instances (I’m now avoiding the word “data” here), I think it is meaningful to identify them by their contents, and for mutable instances I think it is meaningful to identify them by their reference/address. I’ve mentioned my means of emulating the coercion I expect: if I succeed, immutable instances will have “data identity” and mutable ones will have “object identity”. I was trying to emulate a type system and implement it in Julia, and once I finish it, without digging into my library, in any case Vector{T} where T and App{Vect, T} where T will hold the same information (each holds exactly the information needed to be coerced to the other) and shouldn’t behave differently in my emulated system. You could say, “in Julia they can somehow (===) be distinguished from each other”, but I think it’s valid to say “they’re identical” if we don’t take our discussion that far.

Sorry, your words are so correct, and I think my original idea about achieving unsafe coercing is correct, too (although it turned out to be infeasible, because storing the instances of some types into pointers is not allowed). I don’t want to stress that I know this.

The reason why I said “nothing” is that I just felt you were impolite. Yes, my behaviour is ridiculous, but I did feel you were usually offensive in your replies to many of my posts/comments. I shouldn’t have been impolite and said that, and I’d like to say sorry to you again: “sorry, that’s my fault”. However, I did often feel offended by you.

I don’t have time to continue writing this long, long reply or to check if your previous replies are really so offensive; maybe it is just due to my mental problems. To be honest, if you did reply to me with any malice, please stop, I’m so unhappy.

yuyichao · September 2, 2019, 6:31pm

Well, you do realize that the paper you posted is 17 pages long, and 15 excluding the references, right? Also, the paper seems to assume some knowledge about other languages that is not assumed here. In this case, “your word” is simply pasting the link to a very long paper, and I don’t think that’s too much work. I don’t necessarily think that should be ignored, and I actually did read a few paragraphs of it to get enough of an idea of what you meant by “unsafe type coercions”, which I believe is still roughly right. However, it was unclear, and is still unclear now, what property of that operation you are looking for. I’m not challenging you, but I am indeed asking you about your requirements in order to determine which approach is correct.

OK, well, if this is the kind of wording that you are offended by then I’m feeling sorry for you. I see nothing wrong with repeating simple things since,

1. I don’t know if you know that or not. If you do, that’s great. If you don’t, then it served a purpose.
2. This is how logic works and I’m only stating the facts that are related. It’s literally impossible to conclude anything if you didn’t start somewhere that’s known already.
3. Related to the previous two, it’s very dangerous to not state your assumptions and conditions. Countless discussions happen here (and elsewhere) simply because the initial assumptions don’t match. Those discussions are always useless without talking about the assumptions they are built on. Also, even if the assumptions aren’t the issue, stating and agreeing on them will greatly help in identifying where the actual disagreement is.

Now it makes sense why you seem to be offended easily by my posts sometimes. Yes, I state simple, basic facts very often. However, that’s also not something I will change.

Accepted and no problem. At least I hope you now understand why I talked through it the way I did.

And yes, I agree those may not be very clear, and I welcome/was expecting comments on that since I really couldn’t find better words to describe it. I’ll try another way to see if it’s better this time… (And just FYI, the following will start from logic as basic as I can think of, but I’ve already explained above why I do this. FWIW, in this case, I’m asking you to correct my logic since you certainly know the paper very well but I don’t.)

What I meant is that you have an original object o and you want to give it a type T to become an object n. Obviously you want n to be “identical” to o, and I’m asking what identical means here. I did not see a very obvious mention of this in the linked paper, but I’m assuming that this depends on how you are going to use n later. You “cast” o to n, so you obviously want to use it as o sometime later (or you are just doing a conversion, basically), but the question is, how? Do you need o to be of the original type at that time, or do you just need the data?

Here’s where I saw something that doesn’t fit. You’ve said that,

together with your use of either reinterpret or NTuple, suggests that you only need the data but may or may not need the type to be the same when you use it as the original type, i.e. the user code doesn’t rely on the type information and just the data. That’s totally fine up to now, albeit a bit unexpected for me, but you’ve just said that,

which is actually exactly what I expected. Therefore, I expect your tagged object n to hold a reference to the old type (e.g. NewType{Tag, OriginalType}), which you’ll convert back to o::OriginalType before using (please correct me if this assumption is wrong). That’s why I believe it should be easier for you to simply

Note that this does not imply ::Any as the field. It can be specialized, and

At the risk of stating the obvious, this is when you need to decide whether a field should be of an abstract type or if the type should be parametrized, i.e. struct AbstractField; a; end vs struct ParamField{T}; a::T; end. (o::AbstractField).a will be ::Any (abstract) and (o::ParamField{T}).a will be ::T (possibly concrete), but (a::Vector{AbstractField})[1] will be concrete and (a::Vector{ParamField})[1] won’t.
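
The trade-off in the paragraph above can be checked directly with reflection functions from Base. This sketch uses the two struct definitions named there; the checks themselves are an illustration added here.

```julia
struct AbstractField
    a          # untyped field, equivalent to a::Any
end

struct ParamField{T}
    a::T       # field type is fixed by the type parameter
end

# Field access: abstract in one case, concrete in the other.
println(fieldtype(AbstractField, :a))       # Any
println(fieldtype(ParamField{Int}, :a))     # Int64 on 64-bit systems

# Container element type: the situation is reversed.
println(isconcretetype(eltype(Vector{AbstractField})))   # true
println(isconcretetype(eltype(Vector{ParamField})))      # false
```

So the choice depends on whether field access or the container element type needs to be concrete in the hot path.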

This is julia’s definition of object identity, and it seems that you just want the new object to contain all the information about the original object. In other words, your new object needs to contain the old one, and you still only need to retrieve the original object, so what I just said above should still apply.
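
A hedged sketch of the "new object contains the old one" approach, using the inject/project names and the App and Vect names mentioned earlier in the thread. The third type parameter T and the field name value are assumptions of this sketch, not the original poster's design.

```julia
struct Vect end                 # hypothetical tag standing for the Vector constructor

struct App{Cons, Arg, T}        # the third parameter T is an assumption of this sketch
    value::T                    # the original object, stored as-is
end

# Wrap a Vector{T} in its tagged representation and get it back.
inject(v::Vector{T}) where {T} = App{Vect, T, Vector{T}}(v)
project(a::App{Vect, T, Vector{T}}) where {T} = a.value

v = [1, 2, 3]
a = inject(v)
println(project(a) === v)       # the very same object comes back
```

Since the wrapper only holds a reference, project returns the identical (===) array, and mutations of v remain visible through a.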

For completeness, I have to say I don’t think I fully understand what you want to say here, likely because I didn’t read the whole paper. I feel like these aren’t essential given the way you said you’ll use your objects but I’d like to mention this in case this is actually very important/you felt like I’m ignoring you again.

yuyichao · September 2, 2019, 6:48pm

Oh, and since you mentioned emulated system (sadly in the part I don’t fully understand), that’s what I mean by custom data, where custom here means data that’s only meaningful for the user/you and julia doesn’t care about. (It’s a bad word but I really can’t think of a better one… maybe black box? though that sounds somewhat wrong too…).

What I’m saying in this context (or an example of it) is basically, if you are emulating a whole new object system (one that doesn’t have to convert from/to julia object), then you pretty much only need some arbitrary bytes and you don’t need to care at all about holding the original julia object. Then you can certainly

My feeling is that this is not what you are doing (since you want it to work on Vector{T}) but I’m not 100% sure.

thautwarm · September 2, 2019, 7:00pm

My post was flagged as inappropriate again. It’s okay for the first time because I was impolite at that time.

I’m sorry.
